Abstract

Consider communication over a binary-input memoryless output-symmetric
channel with low density parity check (LDPC) codes and maximum a
posteriori (MAP) decoding. The replica method of spin glass theory allows
one to conjecture an analytic formula for the average input-output
conditional entropy per bit in the infinite block length limit. Montanari
proved a lower bound for this entropy, in the case of LDPC ensembles with
convex check degree polynomial, which matches the replica formula. Here we
extend this lower bound to any irregular LDPC ensemble. The new feature of
our work is an analysis of the second derivative of the conditional
input-output entropy with respect to noise. A close relation arises between
this second derivative and correlation or mutual information of codebits.
This allows us to extend the realm of the "interpolation method"; in
particular we show how channel symmetry allows us to control the
fluctuations of the "overlap parameters".
1 Introduction and Main Results

Linear codes based on sparse random graphs have emerged as a major chapter
of coding theory [1]. While the belief propagation (BP) decoding algorithm
and the density evolution method have been explored in detail because of
their low algorithmic complexity and good performance, much remains to be
understood about the optimal (MAP) performance bounds of sparse graph
codes. Recent theoretical progress on the binary erasure channel (BEC) has
convincingly shown that BP and MAP decoding are intimately related
(see [I] and in particular [1]), but understanding this relationship for other
channels is still a largely open problem. In fact, the replica and/or cavity
methods of the statistical mechanics of dilute spin glass models allow one
to conjecture an analytic formula for $H_n(\underline{X} \mid \underline{Y})$,
the entropy of the transmitted message $\underline{X} = (X_1, ..., X_n)$
conditional on the received message $\underline{Y} = (Y_1, ..., Y_n)$, in the
large block length limit $n \to +\infty$. The replica formula expresses the
conditional entropy as the solution of a variational problem whose critical
points are given by the density evolution fixed point equation (see [2], [3]).
If one is to solve the fixed point equation iteratively, the choice of initial
conditions is not necessarily the one given by channel outputs (as in standard
density evolution) but the one which yields the maximum conditional entropy.
Note that a byproduct of the replica formula is the determination of the
maximum a posteriori (MAP) noise threshold, above which reliable
communication is not possible whatever the decoding algorithm.

The proof of the replica formulas is, in general, an open problem^1. In the
context of communication they have been proven for a class of low density
parity check (LDPC) codes on the BEC [11], [12] (see also [13] for
recent work going beyond the BEC) and for low density generator matrix
(LDGM) codes on a class of channels [14].

A promising approach towards a general proof of the replica formulas
seems to be the use of the so-called interpolation method first developed
in the context of the SK model [15], [16], [17]. Consider an LDPC($n, \Lambda, P$)
ensemble where $\Lambda(x) = \sum_d \Lambda_d x^d$ and $P(x) = \sum_k P_k x^k$
are the variable and check degree distributions from the node perspective.
We will always assume that the maximal degrees are finite. Montanari [7]
(see also the related



^1 In a few spin glass models the replica formulas have been fully
demonstrated. Remarkably, Talagrand [5] has proven the Parisi formula with
full symmetry breaking [6] for the Sherrington-Kirkpatrick (SK) model. In
[10] it is shown that the replica symmetric formula holds for a complete
p-spin model with gauge symmetry.



work of Franz-Leone [8] and Talagrand-Panchenko [9]) has developed the
interpolation method for such a system and has derived a lower bound for
the conditional entropy for ensembles with any polynomial $\Lambda(x)$ but
$P(x)$ restricted to be convex for $-e < x < e$ (in particular if the check
degree is constant this means it has to be even). An important fact is that
these lower bounds match the replica solution, and are thus believed to be
tight. Since Fano's inequality tells us that the block error probability for
a code having length $n$ and rate $r$ is lower bounded in terms of
$H_n(\underline{X} \mid \underline{Y})$, an immediate application of the lower
bound is the numerical computation of a rigorous upper bound on the MAP
threshold.

In the present paper we drop the convexity requirement for $P(x)$ in the
cases of the BEC and the BIAWGNC with any noise level, and in the case of
general binary memoryless symmetric (BMS) channels in a high noise regime.
In other words we prove the lower bound for any standard regular (so odd
degrees are allowed) or irregular code ensemble.

Besides the main result itself, we introduce a new tool in the form of a
relationship between the second derivative of the conditional entropy with
respect to the noise and correlation functions of codebits. These correlation
functions are shown to be intimately related to the mutual information
between two codebits. The formulas are somewhat similar to those for GEXIT
functions [1], which relate the first derivative of the conditional entropy
to soft bit estimates. By combining these relations with the interpolation
method we are able to control the fluctuations of the so-called overlap
parameters. This part of our analysis is crucial for proving the general
lower bound on the conditional entropy and relies heavily on channel
symmetry.

A preliminary summary of the present work has appeared in 



1.1 Variational bound on the conditional entropy 

Let $p_{Y|X}(y|x)$ be the transition probability of a BMS($\epsilon$) channel
where $\epsilon$ is the noise parameter (understood to vary in the appropriate
range). We will work in terms of both the likelihood

$$ l = \ln\left[\frac{p_{Y|X}(y|0)}{p_{Y|X}(y|1)}\right] $$

and difference

$$ t = p_{Y|X}(y|0) - p_{Y|X}(y|1) = \tanh\frac{l}{2} $$

variables. It will be convenient to use the notation $c_L(l)$ and $c_D(t)$ for
the distributions of $l$ and $t$, assuming that the all zero codeword is
transmitted (that is to say that $c_L(l)\,dl = c_D(t)\,dt = p_{Y|X}(y|0)\,dy$).

Let $V$ be some random variable with an arbitrary density $d_V(v)$ satisfying
the symmetry condition $d_V(v) = e^{2v} d_V(-v)$. Also let



$$ U = \tanh^{-1}\left[\prod_{i=1}^{k-1} \tanh V_i\right] \tag{1} $$

where the $V_i$ are i.i.d. copies of $V$ and $k$ is the (random) degree of a
check node. We denote by $U_c$, $c = 1, ..., d$, i.i.d. copies of $U$ where
$d$ is the (random) degree of variable nodes. Notice that in the belief
propagation (BP) decoding algorithm $U$ appears as the check to variable node
message and $V$ appears as the variable to check node message. Define the
functional (we view it as a functional of the probability distribution $d_V$)



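$$
\begin{aligned}
h_{RS}[d_V; \Lambda, P] ={}& E_{l,d,U_c}\left[\ln\left(e^{\frac{l}{2}} \prod_{c=1}^{d} (1 + \tanh U_c) + e^{-\frac{l}{2}} \prod_{c=1}^{d} (1 - \tanh U_c)\right)\right] \\
&+ \frac{\Lambda'(1)}{P'(1)}\, E_{k,V_i}\left[\ln\left(1 + \prod_{i=1}^{k} \tanh V_i\right)\right] \\
&- \Lambda'(1)\, E_{V,U}\left[\ln(1 + \tanh V \tanh U)\right] - \frac{\Lambda'(1)}{P'(1)} \ln 2
\end{aligned}
$$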
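The check node rule (1) is straightforward to simulate. The following is a minimal sketch, not the authors' code: the helper names `check_to_variable` and `sample_u`, the check degree distribution and the Gaussian law used for $V$ are all illustrative assumptions (in particular the Gaussian law is not checked against the symmetry condition).

```python
import math
import random

def check_to_variable(vs):
    """Combine incoming messages V_1..V_{k-1} into U = atanh(prod_i tanh V_i), as in (1)."""
    prod = 1.0
    for v in vs:
        prod *= math.tanh(v)
    return math.atanh(prod)

def sample_u(sample_v, check_degree_pmf, rng):
    """Draw one sample of U: pick a check degree k, then k-1 i.i.d. copies of V."""
    degrees = list(check_degree_pmf)
    weights = [check_degree_pmf[k] for k in degrees]
    k = rng.choices(degrees, weights=weights)[0]
    return check_to_variable([sample_v(rng) for _ in range(k - 1)])

rng = random.Random(0)
check_degree_pmf = {3: 0.5, 6: 0.5}          # hypothetical P_k (node perspective)
sample_v = lambda r: r.gauss(0.5, 1.0)       # hypothetical trial law for V
samples = [sample_u(sample_v, check_degree_pmf, rng) for _ in range(1000)]
print(sum(samples) / len(samples))           # empirical mean of U
```

A single incoming message passes through unchanged, and any zero message forces $U = 0$, which matches the interpretation of $U$ as a parity-check constraint.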
Our main result is about the conditional entropy per bit, averaged over the
code ensemble $\mathcal{C}$ = LDPC($n, \Lambda, P$),

$$ E_{\mathcal{C}}[h_n] = \frac{1}{n} E_{\mathcal{C}}[H_n(\underline{X} \mid \underline{Y})] $$

^2 The subscript RS stands for "replica symmetric" because this functional has
been obtained from the replica symmetric ansatz for an appropriate spin glass,
see for example

Definition 1 (Condition H). We define the parameters ($p$ an integer)

$$ m_0^{(2p)} = E[t^{2p}], \qquad m_1^{(2p)} = \frac{d}{d\epsilon} E[t^{2p}], \qquad m_2^{(2p)} = \frac{d^2}{d\epsilon^2} E[t^{2p}] \tag{2} $$

and say that a BMS($\epsilon$) channel is in the high noise regime if the
following series expansions

p; I (2p) I 

EiP^Drn-- E(|)^'l'"!*'l E^^^ (3) 

p p p 

are convergent and if 



2-i)9i'<'i>E©''i< 

p>2 



For example the BSC($\epsilon$) certainly satisfies H if the crossover noise
parameter is close enough to $\frac{1}{2}$, because
$E[t^{2p}] = (1 - 2\epsilon)^{2p}$. More generally any channel with bounded
likelihood variables satisfies H for a regime of sufficiently high noise. For
channels with unbounded likelihoods the condition will be satisfied if
$c_L(l)$ has sufficiently good decay properties. But note that the
BEC($\epsilon$), which has mass at $l = +\infty$, does not satisfy this
condition since $E[t^{2p}] = 1 - \epsilon$. However, as we will see, for the
BEC($\epsilon$) and the BIAWGNC($\epsilon$) we do not need condition H. For
these two channels our analysis can be made fully non-perturbative, and holds
for all noise levels.
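The two moment formulas quoted above can be checked directly from the difference-domain laws. This is a small sketch under the all-zero codeword assumption; the function names `moment_t`, `bsc_law` and `bec_law` are illustrative and not from the paper.

```python
def moment_t(points, p):
    """E[t^(2p)] for a discrete difference-domain law given as (value, probability) pairs."""
    return sum(prob * value ** (2 * p) for value, prob in points)

def bsc_law(eps):
    # Under the all-zero codeword, t = +(1 - 2 eps) w.p. 1 - eps and -(1 - 2 eps) w.p. eps.
    return [(1 - 2 * eps, 1 - eps), (-(1 - 2 * eps), eps)]

def bec_law(eps):
    # t = 1 when the bit is received, t = 0 when it is erased.
    return [(1.0, 1 - eps), (0.0, eps)]

for p in (1, 2, 3):
    print(p, moment_t(bsc_law(0.4), p), moment_t(bec_law(0.4), p))
```

For the BSC the moments decay geometrically in $p$ as the crossover probability approaches $\frac{1}{2}$, whereas for the BEC they stay equal to $1 - \epsilon$ for every $p$, which is why the BEC falls outside condition H.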

Theorem 1 (Variational Bound). Assume communication using a standard
irregular $\mathcal{C}$ = LDPC($n, \Lambda, P$) code ensemble, through a
BEC($\epsilon$) or BIAWGNC($\epsilon$) with any noise level or a
BMS($\epsilon$) channel satisfying H. For all $\epsilon$ in the above ranges
we have,

$$ \liminf_{n \to +\infty} E_{\mathcal{C}}[h_n] \ge \sup_{d_V} h_{RS}[d_V; \Lambda, P] $$

Let us note that this theorem already appears in [18] for the special case
of the BIAWGNC for a Poissonian $\Lambda(x)$. We stress again that a formal
calculation using the replica method yields

$$ \lim_{n \to +\infty} E_{\mathcal{C}}[h_n] = \sup_{d_V} h_{RS}[d_V; \Lambda, P] $$

For this reason it is strongly suspected that the converse inequality holds as
well, but so far no progress has been made except in a limited number of
situations alluded to before.



1.2 Derivatives of the conditional entropy 



Our proof of the variational bound uses integral formulas for the first and
second derivatives of $E_{\mathcal{C}}[h_n]$ with respect to the noise
parameter. The ensemble formulas follow from slightly more general ones that
are valid for any fixed linear code. To give the formulation for a fixed
linear code it is convenient to introduce a noise vector
$\underline{\epsilon} = (\epsilon_1, ..., \epsilon_n)$ and a
BMS($\underline{\epsilon}$) channel with noise level $\epsilon_i$ when bit
$X_i$ is sent. When all noise levels are set to the same value $\epsilon$ the
channel is denoted BMS($\epsilon$). The distributions of the likelihood $l_i$
or difference domain $t_i$ representations of the channel outputs now depend
on $\epsilon_i$. In order to keep the notation simpler we do not explicitly
indicate the $\epsilon_i$ dependence and still denote them as $c_L(l_i)$ and
$c_D(t_i)$ respectively.
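We introduce the soft MAP estimates of bit $X_i$,

$$ L_i = \ln\left[\frac{p_{X_i|\underline{Y}}(0|\underline{y})}{p_{X_i|\underline{Y}}(1|\underline{y})}\right], \qquad T_i = p_{X_i|\underline{Y}}(0|\underline{y}) - p_{X_i|\underline{Y}}(1|\underline{y}) = \tanh\frac{L_i}{2} $$

and the soft estimate for the modulo 2 sum $X_i \oplus X_j$,

$$ L_{ij} = \ln\left[\frac{p_{X_i \oplus X_j|\underline{Y}}(0|\underline{y})}{p_{X_i \oplus X_j|\underline{Y}}(1|\underline{y})}\right], \qquad T_{ij} = p_{X_i \oplus X_j|\underline{Y}}(0|\underline{y}) - p_{X_i \oplus X_j|\underline{Y}}(1|\underline{y}) = \tanh\frac{L_{ij}}{2} $$

In the sequel the notation $\underline{v}^{\sim i}$ (resp.
$\underline{v}^{\sim ij}$) means that component $v_i$ (resp. $v_i$ and $v_j$)
is omitted from the vector $\underline{v}$. The following is known [1] but we
state it for completeness. A derivation in the spirit of the present paper can
also be found in [19].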



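Proposition 1 (GEXIT formula). For any BMS($\underline{\epsilon}$) channel and
any fixed linear code we have

$$ \frac{\partial}{\partial \epsilon_i} H_n(\underline{X} \mid \underline{Y}) = \int_{-1}^{+1} dt_i\, \frac{\partial c_D(t_i)}{\partial \epsilon_i}\, g_1(t_i) $$

where

$$ g_1(t_i) = -E_{\underline{t}^{\sim i}}\left[\ln\left(\frac{1 - t_i T_i}{1 - t_i}\right)\right] $$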



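This formula will be used for an ensemble that is symmetric under
permutation of bits and a BMS($\epsilon$) channel. Using

$$ \frac{d}{d\epsilon} H_n(\underline{X} \mid \underline{Y}) = \sum_{i=1}^{n} \frac{\partial}{\partial \epsilon_i} H_n(\underline{X} \mid \underline{Y}) $$

and averaging over the code ensemble $\mathcal{C}$ we get for the average
entropy per bit,

$$ \frac{d}{d\epsilon} E_{\mathcal{C}}[h_n] = \int_{-1}^{+1} dt_1\, \frac{\partial c_D(t_1)}{\partial \epsilon}\, E_{\mathcal{C}}[g_1(t_1)] $$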

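There are two channels where these general formulas take a simpler form.
For the BEC^3

$$ \frac{\partial}{\partial \ln \epsilon_i} H_n(\underline{X} \mid \underline{Y}) = \ln 2\, (1 - E_{\underline{t}}[T_i]) \tag{4} $$

and

$$ \frac{d}{d \ln \epsilon} E_{\mathcal{C}}[h_n] = \ln 2\, (1 - E_{\mathcal{C},\underline{t}}[T_1]) \tag{5} $$

Similarly on the BIAWGNC

$$ \frac{\partial}{\partial \epsilon_i^{-2}} H_n(\underline{X} \mid \underline{Y}) = -\frac{1}{2} (1 - E_{\underline{t}}[T_i]) \tag{6} $$

and

$$ \frac{d}{d \epsilon^{-2}} E_{\mathcal{C}}[h_n] = -\frac{1}{2} (1 - E_{\mathcal{C},\underline{t}}[T_1]) \tag{7} $$

We will prove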

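Proposition 2 (Correlation formula). For any BMS($\underline{\epsilon}$)
channel and any fixed linear code we have

$$
\begin{aligned}
\frac{\partial^2}{\partial \epsilon_i \partial \epsilon_j} H_n(\underline{X} \mid \underline{Y}) ={}& \delta_{ij} \int_{-1}^{+1} dt_i\, \frac{\partial^2 c_D(t_i)}{\partial \epsilon_i^2}\, g_1(t_i) \\
&+ (1 - \delta_{ij}) \int_{-1}^{+1}\!\!\int_{-1}^{+1} dt_i\, dt_j\, \frac{\partial c_D(t_i)}{\partial \epsilon_i} \frac{\partial c_D(t_j)}{\partial \epsilon_j}\, g_2(t_i, t_j)
\end{aligned}
$$

with

$$ g_2(t_i, t_j) = E_{\underline{t}^{\sim ij}}\left[\ln\left(\frac{1 - t_i T_i - t_j T_j + t_i t_j T_{ij}}{1 - t_i T_i - t_j T_j + t_i t_j T_i T_j}\right)\right] $$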



^3 In this case the ratio in the logarithm may take the ambiguous value $0/0$,
and the formula has to be interpreted accordingly. We will see in section 2
that in terms of extrinsic soft bit estimates there is an analogous expression
that is unambiguous.



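Again, for the case of interest later on, we have a BMS($\epsilon$) channel
and a linear code ensemble that is symmetric under permutations of bits, thus

$$
\frac{d^2}{d\epsilon^2} E_{\mathcal{C}}[h_n] = \int_{-1}^{+1} dt_1\, \frac{\partial^2 c_D(t_1)}{\partial \epsilon^2}\, E_{\mathcal{C}}[g_1(t_1)] + \sum_{j \ne 1} \int_{-1}^{+1}\!\!\int_{-1}^{+1} dt_1\, dt_j\, \frac{\partial c_D(t_1)}{\partial \epsilon} \frac{\partial c_D(t_j)}{\partial \epsilon}\, E_{\mathcal{C}}[g_2(t_1, t_j)] \tag{8}
$$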



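For the BEC^4 these formulas simplify

$$ \frac{\partial^2}{\partial \ln \epsilon_i\, \partial \ln \epsilon_j} H_n(\underline{X} \mid \underline{Y}) = (1 - \delta_{ij}) \ln 2\, E_{\underline{t}}[T_{ij} - T_i T_j] $$

and

$$ \frac{d^2}{(d \ln \epsilon)^2} E_{\mathcal{C}}[h_n] = \ln 2 \sum_{j \ne 1} E_{\mathcal{C},\underline{t}}[T_{1j} - T_1 T_j] \tag{9} $$

For the BIAWGNC

$$ \frac{\partial^2}{\partial \epsilon_i^{-2}\, \partial \epsilon_j^{-2}} H_n(\underline{X} \mid \underline{Y}) = \frac{1}{2} E_{\underline{t}}[(T_{ij} - T_i T_j)^2] \tag{10} $$

and

$$ \frac{d^2}{(d \epsilon^{-2})^2} E_{\mathcal{C}}[h_n] = \frac{1}{2} \sum_{j=1}^{n} E_{\mathcal{C},\underline{t}}[(T_{1j} - T_1 T_j)^2] \tag{11} $$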



Formulas (9) and (11) involve the "correlation" $(T_{ij} - T_i T_j)$ for bits
$X_i$ and $X_j$. The general formula (8) can also be recast in terms of powers
of such correlations by expanding the logarithm (see section 3). Loosely
speaking, in the infinite block length limit $n \to +\infty$, the second
derivative will be well defined only if the correlations have sufficient decay
with respect to the graph distance (the minimal length among all paths joining
$i$ and $j$ on the Tanner graph). Thus we expect good decay properties for all
noise levels except at the phase transition thresholds where, in the limit
$n \to +\infty$, the first derivative generally has bounded discontinuities,
and thus the second derivative cannot be uniformly bounded in $n$.



^4 The same remark as before applies here.



1.3 Relation to mutual information 

The correlation $T_{ij} - T_i T_j$ is basically a measure of the independence
of two codebits, thus it is natural to expect that it should be related to the
mutual information $I(X_i; X_j \mid \underline{Y})$. We do not pursue this
issue in full detail because it is not used in the rest of the paper, but we
wish to briefly state the main relations, which follow naturally from the
previous formulas.

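The BEC($\epsilon$). Take $i \ne j$. The chain rule implies
$H_n(\underline{X} \mid \underline{Y}) = H(X_i X_j \mid \underline{Y}) + H(\underline{X}^{\sim ij} \mid X_i X_j \underline{Y})$.
Also $H(\underline{X}^{\sim ij} \mid X_i X_j \underline{Y}) = H(\underline{X}^{\sim ij} \mid X_i X_j \underline{Y}^{\sim ij})$.
Since $H(\underline{X}^{\sim ij} \mid X_i X_j \underline{Y}^{\sim ij})$ does
not depend on $\epsilon_i, \epsilon_j$ we have

$$ \frac{\partial^2}{\partial \epsilon_i \partial \epsilon_j} H_n(\underline{X} \mid \underline{Y}) = \frac{\partial^2}{\partial \epsilon_i \partial \epsilon_j} H(X_i X_j \mid \underline{Y}) $$

The conditional entropy on the r.h.s. is explicitly
$\epsilon_i \epsilon_j H(X_i X_j \mid \underline{Y}^{\sim ij}) + \epsilon_i (1 - \epsilon_j) H(X_i \mid X_j \underline{Y}^{\sim ij}) + (1 - \epsilon_i) \epsilon_j H(X_j \mid X_i \underline{Y}^{\sim ij})$.
In this expression the three conditional entropies are independent of the
channel parameters $\epsilon_i$ and $\epsilon_j$. Thus

$$
\begin{aligned}
\frac{\partial^2}{\partial \epsilon_i \partial \epsilon_j} H_n(\underline{X} \mid \underline{Y}) &= H(X_i X_j \mid \underline{Y}^{\sim ij}) - H(X_i \mid X_j \underline{Y}^{\sim ij}) - H(X_j \mid X_i \underline{Y}^{\sim ij}) \\
&= H(X_j \mid \underline{Y}^{\sim ij}) - H(X_j \mid X_i \underline{Y}^{\sim ij}) \\
&= I(X_i; X_j \mid \underline{Y}^{\sim ij}) = \frac{1}{\epsilon_i \epsilon_j} I(X_i; X_j \mid \underline{Y})
\end{aligned}
$$

Summarizing, we have obtained for $i \ne j$

$$ \frac{\partial^2}{\partial \ln \epsilon_i\, \partial \ln \epsilon_j} H_n(\underline{X} \mid \underline{Y}) = I(X_i; X_j \mid \underline{Y}) = \ln 2\, E_{\underline{t}}[T_{ij} - T_i T_j] $$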
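The identity $I(X_i; X_j \mid \underline{Y}) = \ln 2\, E[T_{ij} - T_i T_j]$ on the BEC can be checked by brute force on a toy code. The sketch below verifies the stronger pointwise statement, for every erasure pattern, under the all-zero codeword assumption; the parity-check matrix `H` and all function names are hypothetical and chosen only for illustration.

```python
import itertools
import math

def codewords(H, n):
    """All x in {0,1}^n with H x = 0 (mod 2), by brute force."""
    return [x for x in itertools.product((0, 1), repeat=n)
            if all(sum(h * xi for h, xi in zip(row, x)) % 2 == 0 for row in H)]

def entropy_of_projection(compatible, idx):
    """Entropy (nats) of the uniform law on `compatible`, projected on coordinates idx."""
    counts = {}
    for x in compatible:
        key = tuple(x[i] for i in idx)
        counts[key] = counts.get(key, 0) + 1
    total = len(compatible)
    return -sum(c / total * math.log(c / total) for c in counts.values())

def soft_estimate(compatible, idx):
    """Posterior average of (-1)^(sum of bits in idx): T_i for one index, T_ij for two."""
    return sum((-1) ** sum(x[i] for i in idx) for x in compatible) / len(compatible)

def check_bec_identity(H, n, i, j):
    code = codewords(H, n)
    for erased in itertools.product((False, True), repeat=n):
        # All-zero codeword sent: the posterior is uniform on the codewords
        # that vanish on every non-erased position.
        compatible = [x for x in code if all(e or xi == 0 for e, xi in zip(erased, x))]
        mi = (entropy_of_projection(compatible, [i])
              + entropy_of_projection(compatible, [j])
              - entropy_of_projection(compatible, [i, j]))
        corr = (soft_estimate(compatible, [i, j])
                - soft_estimate(compatible, [i]) * soft_estimate(compatible, [j]))
        if abs(mi - math.log(2) * corr) > 1e-12:
            return False
    return True

H = [[1, 1, 0, 1], [0, 1, 1, 1]]  # hypothetical toy parity-check matrix
print(check_bec_identity(H, 4, 0, 2))
```

Because the identity holds for each erasure pattern separately, averaging over the channel outputs gives the relation stated above.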



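The BIAWGNC($\epsilon$). Take $i \ne j$. We note that

$$ T_{ij} = p_{X_i X_j|\underline{Y}}(00 \mid \underline{y}) + p_{X_i X_j|\underline{Y}}(11 \mid \underline{y}) - p_{X_i X_j|\underline{Y}}(01 \mid \underline{y}) - p_{X_i X_j|\underline{Y}}(10 \mid \underline{y}) $$

from which it follows

$$ (T_{ij} - T_i T_j)^2 \le 4 \sum_{x_i, x_j} \left| p_{X_i X_j|\underline{Y}}(x_i x_j \mid \underline{y}) - p_{X_i|\underline{Y}}(x_i \mid \underline{y})\, p_{X_j|\underline{Y}}(x_j \mid \underline{y}) \right|^2 $$

Applying the inequality

$$ \frac{1}{2} \sum_x |p(x) - q(x)|^2 \le D(p \,\|\, q) $$

for the Kullback-Leibler divergence of the two distributions
$p = p_{X_i X_j|\underline{Y}}$ and $q = p_{X_i|\underline{Y}}\, p_{X_j|\underline{Y}}$,
we get for $i \ne j$

$$ (T_{ij} - T_i T_j)^2 \le 8\, I(X_i; X_j \mid \underline{y}) $$

Averaging over the outputs and using (10) we get

$$ \frac{\partial^2}{\partial \epsilon_i^{-2}\, \partial \epsilon_j^{-2}} H_n(\underline{X} \mid \underline{Y}) = \frac{1}{2} E_{\underline{t}}[(T_{ij} - T_i T_j)^2] \le 4\, I(X_i; X_j \mid \underline{Y}) $$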
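The pointwise bound $(T_{ij} - T_i T_j)^2 \le 8\, I(X_i; X_j \mid \underline{y})$ only involves a joint law of two bits, so it can be probed numerically. The following sketch samples random joint laws; the sampling scheme and the function names are illustrative assumptions, and a random search is of course not a proof.

```python
import math
import random

def correlation_and_mi(p):
    """p[(xi, xj)] is a joint pmf of two bits; return (T_ij - T_i T_j, mutual information in nats)."""
    pi = {a: p[(a, 0)] + p[(a, 1)] for a in (0, 1)}
    pj = {b: p[(0, b)] + p[(1, b)] for b in (0, 1)}
    t_i = pi[0] - pi[1]
    t_j = pj[0] - pj[1]
    t_ij = p[(0, 0)] + p[(1, 1)] - p[(0, 1)] - p[(1, 0)]
    mi = sum(q * math.log(q / (pi[a] * pj[b])) for (a, b), q in p.items() if q > 0)
    return t_ij - t_i * t_j, mi

def random_joint(rng):
    """A random joint pmf on {0,1}^2 (normalized uniform weights)."""
    w = [rng.random() + 1e-9 for _ in range(4)]
    s = sum(w)
    keys = [(0, 0), (0, 1), (1, 0), (1, 1)]
    return {k: x / s for k, x in zip(keys, w)}

rng = random.Random(1)
# Largest observed value of corr^2 - 8 * MI; the bound says it is never positive.
worst = max(c * c - 8 * mi
            for c, mi in (correlation_and_mi(random_joint(rng)) for _ in range(10000)))
print(worst <= 1e-12)
```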



Highly noisy BMS channels. From the high noise expansion (see section 3)
and the above remarks, we can derive an inequality like the preceding one,
which holds in the high noise regime for general BMS channels. The number
8 gets replaced by some suitable factor which depends on the channel noise.

1.4 Organisation of the paper 

The statistical mechanics formulation is very convenient for performing many
of the necessary calculations, and the interpolation method is also best
formulated in that framework. Thus we briefly recall it in section 2, as well
as a few connections to the information theoretic language. Section 3 contains
the derivation of the correlation formula (proposition 2) and other useful
material. The interpolation method that is used to prove the variational bound
(theorem 1) is presented in section 4. The main new ingredient of the proof
is an estimate (see proposition 3 in section 4) on the fluctuations of overlap
parameters. The proof of proposition 3 is the object of section 5. The
appendices contain technical calculations involved in the proofs.

2 Statistical Mechanics Formulation 

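Consider a fixed code belonging to the ensemble $\mathcal{C}$ =
LDPC($n, \Lambda, P$). The posterior distribution
$p_{\underline{X}|\underline{Y}}(\underline{x} \mid \underline{y})$ used in
MAP decoding can be viewed as the Gibbs measure of a particular random spin
system. For this it is convenient to use the usual mapping of bits onto spins
$\sigma_i = (-1)^{x_i}$. Given any set $A \subset \{1, ..., n\}$, we use the
notation $\sigma_A = \prod_{i \in A} \sigma_i$. Thus
$\sigma_A = (-1)^{\oplus_{i \in A} x_i}$. It will be clear from the context if
the subscript is a set or a single bit. For a uniform prior over the code
words and a BMS channel, Bayes rule implies
$p_{\underline{X}|\underline{Y}}(\underline{x} \mid \underline{y}) = \mu(\underline{\sigma})$
with

$$ \mu(\underline{\sigma}) = \frac{1}{Z} \prod_{c} \frac{1}{2} (1 + \sigma_{\partial c}) \prod_{i=1}^{n} e^{\frac{l_i}{2} \sigma_i} $$

where $\prod_c$ is a product over all check nodes of the given code, and
$\sigma_{\partial c} = \prod_{i \in \partial c} \sigma_i$ is the product of
the spins (mod 2 sum of the bits) attached to the variable nodes $i$ that are
connected to a check $c$. $Z$ is the normalization factor or "partition
function" and $\ln Z$ is the "pressure" associated to the Gibbs measure
$\mu(\underline{\sigma})$. It is related to the conditional entropy by

$$ H_n(\underline{X} \mid \underline{Y}) = E_{\underline{l}}[\ln Z] - \sum_{i=1}^{n} \int_{-\infty}^{+\infty} dl_i\, c_L(l_i)\, \frac{l_i}{2} \tag{12} $$

Expectations with respect to $\mu(\underline{\sigma})$ for a fixed graph and a
fixed channel output are denoted by the bracket $\langle - \rangle$. More
precisely for any $A \subset \{1, ..., n\}$,

$$ \langle \sigma_A \rangle = \sum_{\underline{\sigma}} \sigma_A\, \mu(\underline{\sigma}) $$

More details on the above formalism can be found for example in [18].
The soft estimate of the bit $X_i$ is (in the difference domain)

$$ T_i = \langle \sigma_i \rangle \tag{13} $$

We will also need soft estimates for $X_i \oplus X_j$, $i \ne j$. In the
statistical mechanics formalism they are simply expressed as

$$ T_{ij} = \langle \sigma_i \sigma_j \rangle \tag{14} $$

In particular the correlation between bits $X_i$ and $X_j$ becomes
$T_{ij} - T_i T_j = \langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle$,
which is the usual notion of spin-spin correlation in statistical mechanics.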

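In section 3 (and the appendices) the algebraic manipulations are best
performed in terms of "extrinsic" soft bit estimates. We will need many
variants, the simplest one being the estimate of $X_i$ when observation $y_i$
is not available

$$ T_i^{\sim i} = \tanh\frac{L_i^{\sim i}}{2} = p_{X_i|\underline{Y}^{\sim i}}(0 \mid \underline{y}^{\sim i}) - p_{X_i|\underline{Y}^{\sim i}}(1 \mid \underline{y}^{\sim i}) $$

The second is the estimate of $X_i$ when both $y_i$ and $y_j$ are not available

$$ T_i^{\sim ij} = \tanh\frac{L_i^{\sim ij}}{2} = p_{X_i|\underline{Y}^{\sim ij}}(0 \mid \underline{y}^{\sim ij}) - p_{X_i|\underline{Y}^{\sim ij}}(1 \mid \underline{y}^{\sim ij}) $$

Finally we will also need the extrinsic estimate of the mod 2 sum
$X_i \oplus X_j$ when both $y_i$ and $y_j$ are not available,

$$ T_{ij}^{\sim ij} = \tanh\frac{L_{ij}^{\sim ij}}{2} = p_{X_i \oplus X_j|\underline{Y}^{\sim ij}}(0 \mid \underline{y}^{\sim ij}) - p_{X_i \oplus X_j|\underline{Y}^{\sim ij}}(1 \mid \underline{y}^{\sim ij}) $$

It is practical to work in terms of a modified Gibbs average
$\langle \sigma_A \rangle_{\sim i}$ which means that $l_i = 0$, in other words
$y_i$ is not available. Similarly we introduce the averages
$\langle \sigma_A \rangle_{\sim ij}$, for which $l_i = l_j = 0$, in other words
both $y_i$ and $y_j$ are unavailable. One has

$$ T_i^{\sim i} = \langle \sigma_i \rangle_{\sim i}, \qquad T_i^{\sim ij} = \langle \sigma_i \rangle_{\sim ij}, \qquad T_{ij}^{\sim ij} = \langle \sigma_i \sigma_j \rangle_{\sim ij} $$

The extrinsic brackets $\langle - \rangle_{\sim i}$ and
$\langle - \rangle_{\sim ij}$ are related to the usual ones
$\langle - \rangle$ by the following formulas derived in the appendices,

$$ \langle \sigma_i \rangle_{\sim i} = \frac{\langle \sigma_i \rangle - t_i}{1 - \langle \sigma_i \rangle t_i} \tag{15} $$

and

$$ \langle \sigma_i \rangle_{\sim ij} = \frac{\langle \sigma_i \rangle - t_i - \langle \sigma_i \sigma_j \rangle t_j + t_i t_j \langle \sigma_j \rangle}{1 - \langle \sigma_i \rangle t_i - \langle \sigma_j \rangle t_j + \langle \sigma_i \sigma_j \rangle t_i t_j} \tag{16} $$

$$ \langle \sigma_i \sigma_j \rangle_{\sim ij} = \frac{\langle \sigma_i \sigma_j \rangle - t_i \langle \sigma_j \rangle - t_j \langle \sigma_i \rangle + t_i t_j}{1 - \langle \sigma_i \rangle t_i - \langle \sigma_j \rangle t_j + \langle \sigma_i \sigma_j \rangle t_i t_j} \tag{17} $$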
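The relations between extrinsic and usual Gibbs averages can be checked by exhaustive enumeration on a small code. The sketch below assumes the Gibbs weight $\prod_c \frac{1}{2}(1 + \sigma_{\partial c}) \prod_i e^{l_i \sigma_i / 2}$ and obtains extrinsic averages by setting the relevant $l_i$ to zero; the toy checks, the log-likelihood values and the function name `gibbs_average` are hypothetical.

```python
import itertools
import math

def gibbs_average(checks, llrs, subset):
    """<sigma_A> under mu(sigma) ~ prod_c (1 + sigma_c)/2 * prod_i exp(l_i sigma_i / 2)."""
    num = den = 0.0
    for sigma in itertools.product((1, -1), repeat=len(llrs)):
        # The check factor (1 + sigma_c)/2 is 1 when the parity is satisfied, 0 otherwise.
        if any(math.prod(sigma[i] for i in c) != 1 for c in checks):
            continue
        w = math.exp(sum(l * s / 2 for l, s in zip(llrs, sigma)))
        num += w * math.prod(sigma[i] for i in subset)
        den += w
    return num / den

checks = [(0, 1, 2), (1, 2, 3)]   # hypothetical toy code
llrs = [0.8, -0.3, 1.1, 0.4]      # hypothetical channel log-likelihood ratios
i, j = 0, 3
t_i, t_j = math.tanh(llrs[i] / 2), math.tanh(llrs[j] / 2)

s_i = gibbs_average(checks, llrs, [i])
s_j = gibbs_average(checks, llrs, [j])
s_ij = gibbs_average(checks, llrs, [i, j])

# Extrinsic averages: set l_i = 0 (and l_j = 0) and recompute.
no_i = [0.0 if k == i else l for k, l in enumerate(llrs)]
no_ij = [0.0 if k in (i, j) else l for k, l in enumerate(llrs)]

denom = 1 - s_i * t_i - s_j * t_j + s_ij * t_i * t_j
lhs15, rhs15 = gibbs_average(checks, no_i, [i]), (s_i - t_i) / (1 - s_i * t_i)
lhs16 = gibbs_average(checks, no_ij, [i])
rhs16 = (s_i - t_i - s_ij * t_j + t_i * t_j * s_j) / denom
lhs17 = gibbs_average(checks, no_ij, [i, j])
rhs17 = (s_ij - t_i * s_j - t_j * s_i + t_i * t_j) / denom
print(lhs15 - rhs15, lhs16 - rhs16, lhs17 - rhs17)
```

All three differences vanish up to rounding, since removing an observation amounts to reweighting by $e^{-l_i \sigma_i / 2} \propto 1 - t_i \sigma_i$.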

3 The Correlation Formula 

A derivation of proposition 1 and of (4), (6) within the formalism outlined in
section 2 can be found in [19].
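3.1 Proof of proposition 2

For any BMS($\underline{\epsilon}$) channel and linear code we have from (12)

$$ \frac{\partial}{\partial \epsilon_i} H_n(\underline{X} \mid \underline{Y}) = E_{\underline{l}^{\sim i}} \int_{-\infty}^{+\infty} dl_i\, \frac{\partial c_L(l_i)}{\partial \epsilon_i} \left(\ln Z - \frac{l_i}{2}\right) $$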



The second equality follows by permutation symmetry of code bits. Differ- 
entiating once more, we get 



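$$ \frac{\partial^2}{\partial \epsilon_i \partial \epsilon_j} H_n(\underline{X} \mid \underline{Y}) = \delta_{ij} S_1 + (1 - \delta_{ij}) S_2 \tag{18} $$

where

$$ S_1 = E_{\underline{l}^{\sim i}} \int_{-\infty}^{+\infty} dl_i\, \frac{\partial^2 c_L(l_i)}{\partial \epsilon_i^2} \left(\ln Z - \frac{l_i}{2}\right) \tag{19} $$

and

$$ S_2 = E_{\underline{l}^{\sim ij}} \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} dl_i\, dl_j\, \frac{\partial c_L(l_i)}{\partial \epsilon_i} \frac{\partial c_L(l_j)}{\partial \epsilon_j} \left(\ln Z - \frac{l_i}{2}\right) $$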



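First we consider $S_1$. Let

$$ Z_{\sim i} = \sum_{\underline{\sigma}} \prod_{c \in \mathcal{C}} \frac{1}{2} (1 + \sigma_{\partial c}) \prod_{k \ne i} e^{\frac{l_k}{2} \sigma_k} $$

be the partition function for the Gibbs measure $\langle - \rangle_{\sim i}$
introduced in section 2 and consider

$$ \ln \frac{Z}{Z_{\sim i}} = \ln \left\langle e^{\frac{l_i}{2} \sigma_i} \right\rangle_{\sim i} $$

Using the identity

$$ e^{\frac{l_i}{2} \sigma_i} = e^{\frac{l_i}{2}}\, \frac{1 + t_i \sigma_i}{1 + t_i} \tag{20} $$

we get

$$ \ln Z - \frac{l_i}{2} = \ln Z_{\sim i} + \ln\left(\frac{1 + t_i \langle \sigma_i \rangle_{\sim i}}{1 + t_i}\right) $$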
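The decomposition $\ln Z - \frac{l_i}{2} = \ln Z_{\sim i} + \ln\frac{1 + t_i \langle \sigma_i \rangle_{\sim i}}{1 + t_i}$ can be confirmed numerically on a toy code. This is a sketch by exhaustive enumeration; the checks, the log-likelihood values and the helper names are hypothetical, and the Gibbs weight is assumed to be $\prod_c \frac{1}{2}(1 + \sigma_{\partial c}) \prod_k e^{l_k \sigma_k / 2}$.

```python
import itertools
import math

def satisfied(checks, sigma):
    """True when every parity check has spin product +1."""
    return all(math.prod(sigma[i] for i in c) == 1 for c in checks)

def partition_function(checks, llrs):
    """Z = sum over configurations satisfying every check of exp(sum_k l_k sigma_k / 2)."""
    return sum(math.exp(sum(l * s / 2 for l, s in zip(llrs, sigma)))
               for sigma in itertools.product((1, -1), repeat=len(llrs))
               if satisfied(checks, sigma))

def extrinsic_spin(checks, llrs, i):
    """<sigma_i> computed with l_i set to 0, i.e. without the observation y_i."""
    muted = [0.0 if k == i else l for k, l in enumerate(llrs)]
    num = den = 0.0
    for sigma in itertools.product((1, -1), repeat=len(llrs)):
        if satisfied(checks, sigma):
            w = math.exp(sum(l * s / 2 for l, s in zip(muted, sigma)))
            num += w * sigma[i]
            den += w
    return num / den

checks = [(0, 1, 2), (1, 2, 3)]   # hypothetical toy code
llrs = [0.8, -0.3, 1.1, 0.4]      # hypothetical channel log-likelihood ratios
i = 1
t_i = math.tanh(llrs[i] / 2)
muted = [0.0 if k == i else l for k, l in enumerate(llrs)]

lhs = math.log(partition_function(checks, llrs)) - llrs[i] / 2
rhs = (math.log(partition_function(checks, muted))
       + math.log((1 + t_i * extrinsic_spin(checks, llrs, i)) / (1 + t_i)))
print(abs(lhs - rhs))
```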



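When we replace this expression in the integral (19) we see that the
contribution of $\ln Z_{\sim i}$ vanishes because this latter quantity is
independent of $l_i$. Indeed

$$ \int_{-\infty}^{+\infty} dl_i\, \frac{\partial^2 c_L(l_i)}{\partial \epsilon_i^2}\, \ln Z_{\sim i} = \ln Z_{\sim i}\, \frac{\partial^2}{\partial \epsilon_i^2} \int_{-\infty}^{+\infty} dl_i\, c_L(l_i) = 0 $$

since $c_L(l_i)$ is a normalized probability distribution. Then, using (15)
leads to

$$
\begin{aligned}
S_1 &= \int_{-\infty}^{+\infty} dl_i\, \frac{\partial^2 c_L(l_i)}{\partial \epsilon_i^2}\, E_{\underline{l}^{\sim i}}\left[\ln\left(\frac{1 + t_i \langle \sigma_i \rangle_{\sim i}}{1 + t_i}\right)\right] && (21) \\
&= -\int_{-1}^{+1} dt_i\, \frac{\partial^2 c_D(t_i)}{\partial \epsilon_i^2}\, E_{\underline{t}^{\sim i}}\left[\ln\left(\frac{1 - t_i \langle \sigma_i \rangle}{1 - t_i}\right)\right] && (22)
\end{aligned}
$$

which (because of (13)) coincides with the first term in the correlation
formula.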

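Now we consider the term $S_2$. Notice that

$$ \int_{-\infty}^{+\infty} dl_i\, \frac{\partial c_L(l_i)}{\partial \epsilon_i}\, \frac{l_j}{2} = \frac{l_j}{2}\, \frac{\partial}{\partial \epsilon_i} \int_{-\infty}^{+\infty} dl_i\, c_L(l_i) = 0 $$

Thus we can rewrite $S_2$ as

$$ S_2 = E_{\underline{l}^{\sim ij}} \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} dl_i\, dl_j\, \frac{\partial c_L(l_i)}{\partial \epsilon_i} \frac{\partial c_L(l_j)}{\partial \epsilon_j} \left(\ln Z - \frac{l_i}{2} - \frac{l_j}{2}\right) $$

Let $Z_{\sim ij} = \sum_{\underline{\sigma}} \prod_{c \in \mathcal{C}} \frac{1}{2} (1 + \sigma_{\partial c}) \prod_{k \ne i,j} e^{\frac{l_k}{2} \sigma_k}$
be the partition function for the Gibbs measure $\langle - \rangle_{\sim ij}$,
and consider

$$ \ln \frac{Z}{Z_{\sim ij}} = \ln \left\langle e^{\frac{l_i}{2} \sigma_i + \frac{l_j}{2} \sigma_j} \right\rangle_{\sim ij} $$

Using again (20) we get

$$ \ln Z - \frac{l_i}{2} - \frac{l_j}{2} = \ln Z_{\sim ij} + \ln\left(\frac{1 + t_i \langle \sigma_i \rangle_{\sim ij} + t_j \langle \sigma_j \rangle_{\sim ij} + t_i t_j \langle \sigma_i \sigma_j \rangle_{\sim ij}}{1 + t_i + t_j + t_i t_j}\right) $$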

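As before the contribution of $\ln Z_{\sim ij}$ vanishes because it is
independent of $l_i$, $l_j$. Similarly we have

$$ \int\!\!\int dl_i\, dl_j\, \frac{\partial c_L(l_i)}{\partial \epsilon_i} \frac{\partial c_L(l_j)}{\partial \epsilon_j}\, \ln(1 + t_i \langle \sigma_i \rangle_{\sim ij}) = \text{same with } i \text{ and } j \text{ exchanged} = 0 $$

and

$$ \int\!\!\int dl_i\, dl_j\, \frac{\partial c_L(l_i)}{\partial \epsilon_i} \frac{\partial c_L(l_j)}{\partial \epsilon_j}\, \ln(1 + t_i) = \text{same with } i \text{ and } j \text{ exchanged} = 0 $$

Using these four identities then leads to

$$
S_2 = E_{\underline{l}^{\sim ij}} \int\!\!\int dl_i\, dl_j\, \frac{\partial c_L(l_i)}{\partial \epsilon_i} \frac{\partial c_L(l_j)}{\partial \epsilon_j}\, \ln\left(\frac{1 + t_i \langle \sigma_i \rangle_{\sim ij} + t_j \langle \sigma_j \rangle_{\sim ij} + t_i t_j \langle \sigma_i \sigma_j \rangle_{\sim ij}}{1 + t_i \langle \sigma_i \rangle_{\sim ij} + t_j \langle \sigma_j \rangle_{\sim ij} + t_i t_j \langle \sigma_i \rangle_{\sim ij} \langle \sigma_j \rangle_{\sim ij}}\right) \tag{23}
$$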



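To get the formulas in terms of usual averages we use the relations (16),
(17). Hence

$$
S_2 = E_{\underline{t}^{\sim ij}} \int_{-1}^{+1}\!\!\int_{-1}^{+1} dt_i\, dt_j\, \frac{\partial c_D(t_i)}{\partial \epsilon_i} \frac{\partial c_D(t_j)}{\partial \epsilon_j}\, \ln\left(\frac{1 - t_i \langle \sigma_i \rangle - t_j \langle \sigma_j \rangle + t_i t_j \langle \sigma_i \sigma_j \rangle}{1 - t_i \langle \sigma_i \rangle - t_j \langle \sigma_j \rangle + t_i t_j \langle \sigma_i \rangle \langle \sigma_j \rangle}\right) \tag{24}
$$

Because of (13) and (14) this coincides with the second term in the
correlation formula. The proposition now follows from (18), (22) and (24).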

3.2 Expressions in terms of the spin-spin correlation 

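The BEC. From $c_D(t) = (1 - \epsilon)\, \delta(t - 1) + \epsilon\, \delta(t)$,
the second derivative in terms of extrinsic quantities (formulas (21) and
(23)) reduces to

$$
\frac{\partial^2}{\partial \epsilon_i \partial \epsilon_j} H_n(\underline{X} \mid \underline{Y}) = (1 - \delta_{ij})\, E_{\underline{t}^{\sim ij}}\left[\ln\left(\frac{1 + \langle \sigma_i \rangle_{\sim ij} + \langle \sigma_j \rangle_{\sim ij} + \langle \sigma_i \sigma_j \rangle_{\sim ij}}{1 + \langle \sigma_i \rangle_{\sim ij} + \langle \sigma_j \rangle_{\sim ij} + \langle \sigma_i \rangle_{\sim ij} \langle \sigma_j \rangle_{\sim ij}}\right)\right]
$$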

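There are various ways to see that for the BEC any Gibbs average
$\langle \sigma_A \rangle$ or $\langle \sigma_A \rangle_{\sim ij}$ takes values
in $\{0, 1\}$. A heuristic explanation is that bits (or their mod 2 sums) are
either perfectly known or erased. A more formal explanation follows from a
Nishimori identity^5 combined with the Griffiths-Kelly-Sherman (GKS)
correlation inequality [18]. For example,
$E[\langle \sigma_A \rangle^2] = E[\langle \sigma_A \rangle]$ (Nishimori) and
$\langle \sigma_A \rangle \ge 0$ (GKS). Thus
$\langle \sigma_A \rangle (1 - \langle \sigma_A \rangle)$ is a positive random
variable with zero expectation and is therefore equal to $0$ with probability
one. These remarks imply that

$$
\frac{\partial^2}{\partial \epsilon_i \partial \epsilon_j} H_n(\underline{X} \mid \underline{Y}) = \frac{1}{\epsilon_i \epsilon_j} (1 - \delta_{ij})\, E_{\underline{t}}\left[\ln\left(\frac{1 + \langle \sigma_i \rangle + \langle \sigma_j \rangle + \langle \sigma_i \sigma_j \rangle}{1 + \langle \sigma_i \rangle + \langle \sigma_j \rangle + \langle \sigma_i \rangle \langle \sigma_j \rangle}\right)\right]
$$

^5 We will use various such identities. A proof of their most general form can
be found in [18]. A general reference is [21].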
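Note that in deriving the last expression we used the fact that
$l_i = \infty$ ($l_j = \infty$) implies that $\sigma_i = +1$
($\sigma_j = +1$), which makes the logarithm term equal to zero. From the
previous remarks we also have

$$
\begin{aligned}
\ln(1 + \langle \sigma_i \rangle + \langle \sigma_j \rangle + \langle \sigma_i \sigma_j \rangle) ={}& (\ln 2)(\langle \sigma_i \rangle + \langle \sigma_j \rangle + \langle \sigma_i \sigma_j \rangle) \\
&+ (\ln 3 - 2 \ln 2)(\langle \sigma_i \rangle \langle \sigma_j \rangle + \langle \sigma_i \rangle \langle \sigma_i \sigma_j \rangle + \langle \sigma_j \rangle \langle \sigma_i \sigma_j \rangle) \\
&+ (5 \ln 2 - 3 \ln 3)\, \langle \sigma_i \rangle \langle \sigma_j \rangle \langle \sigma_i \sigma_j \rangle
\end{aligned}
$$

and

$$ \ln(1 + \langle \sigma_i \rangle + \langle \sigma_j \rangle + \langle \sigma_i \rangle \langle \sigma_j \rangle) = (\ln 2)(\langle \sigma_i \rangle + \langle \sigma_j \rangle) $$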
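Both expansions rely only on the fact that the three Gibbs averages take values in $\{0, 1\}$, so they can be verified by listing the eight possible cases. A minimal sketch (the names `expansion_three` and `expansion_two` are illustrative):

```python
import itertools
import math

LN2, LN3 = math.log(2), math.log(3)

def expansion_three(a, b, c):
    """Polynomial form of ln(1 + a + b + c) for a, b, c in {0, 1}."""
    return (LN2 * (a + b + c)
            + (LN3 - 2 * LN2) * (a * b + a * c + b * c)
            + (5 * LN2 - 3 * LN3) * a * b * c)

def expansion_two(a, b):
    """Polynomial form of ln(1 + a + b + a b) for a, b in {0, 1}."""
    return LN2 * (a + b)

ok_three = all(abs(math.log(1 + a + b + c) - expansion_three(a, b, c)) < 1e-12
               for a, b, c in itertools.product((0, 1), repeat=3))
ok_two = all(abs(math.log(1 + a + b + a * b) - expansion_two(a, b)) < 1e-12
             for a, b in itertools.product((0, 1), repeat=2))
print(ok_three, ok_two)
```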

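The difference of the two logarithms is simplified using the following four
Nishimori identities,

$$ E_{\underline{t}}[\langle \sigma_i \rangle \langle \sigma_j \rangle] = E_{\underline{t}}[\langle \sigma_i \rangle \langle \sigma_i \sigma_j \rangle] = E_{\underline{t}}[\langle \sigma_j \rangle \langle \sigma_i \sigma_j \rangle] = E_{\underline{t}}[\langle \sigma_i \sigma_j \rangle \langle \sigma_i \rangle \langle \sigma_j \rangle] $$

Finally we obtain the simple expression

$$ \frac{\partial^2}{\partial \epsilon_i \partial \epsilon_j} H_n(\underline{X} \mid \underline{Y}) = \frac{\ln 2}{\epsilon_i \epsilon_j} (1 - \delta_{ij})\, E_{\underline{t}}[T_{ij} - T_i T_j] $$

Let us point out that the second GKS inequality (for the BEC) implies that
$\langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle \ge 0$,
thus the correlation takes values in $\{0, 1\}$ and we have
$E_{\underline{t}}[T_{ij} - T_i T_j] = E_{\underline{t}}[(T_{ij} - T_i T_j)^2]$.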

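The BIAWGNC. From the explicit form

$$ c_L(l) = \frac{1}{\sqrt{8 \pi \epsilon^{-2}}} \exp\left(-\frac{(l - 2\epsilon^{-2})^2}{8 \epsilon^{-2}}\right) $$

one can show that the correlation formula reduces to

$$ \frac{\partial^2}{\partial \epsilon_i^{-2}\, \partial \epsilon_j^{-2}} H_n(\underline{X} \mid \underline{Y}) = \frac{1}{2} E_{\underline{t}}\left[(\langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle)^2\right] = \frac{1}{2} E_{\underline{t}}\left[(T_{ij} - T_i T_j)^2\right] $$

Alternatively, differentiating (12) thanks to

$$ \frac{\partial c_L(l)}{\partial \epsilon^{-2}} = \left(-2 \frac{\partial}{\partial l} + 2 \frac{\partial^2}{\partial l^2}\right) c_L(l) $$

and using integration by parts also leads to this simpler form. This route is
much simpler and the details can be found in [18].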
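As a sanity check on the Gaussian likelihood density, the sketch below assumes antipodal inputs $\pm 1$ and noise variance $\epsilon^2$, so that under the all-zero codeword the log-likelihood ratio is Gaussian with mean $2\epsilon^{-2}$ and variance $4\epsilon^{-2}$. Under that assumption it checks the channel symmetry $c_L(l) = e^{l} c_L(-l)$ and, by finite differences, a heat-equation-type identity for the derivative with respect to $s = \epsilon^{-2}$. The function names and the test point are illustrative.

```python
import math

def c_L(l, s):
    """LLR density under the all-zero codeword; s = eps^(-2) (assumed parameterization)."""
    return math.exp(-(l - 2 * s) ** 2 / (8 * s)) / math.sqrt(8 * math.pi * s)

def heat_identity_gap(l, s, h=1e-4):
    """Finite-difference estimate of d c_L/ds - (-2 d/dl + 2 d^2/dl^2) c_L."""
    d_s = (c_L(l, s + h) - c_L(l, s - h)) / (2 * h)
    d_l = (c_L(l + h, s) - c_L(l - h, s)) / (2 * h)
    d_ll = (c_L(l + h, s) - 2 * c_L(l, s) + c_L(l - h, s)) / h ** 2
    return d_s - (-2 * d_l + 2 * d_ll)

l, s = 1.3, 0.8
print(abs(c_L(l, s) - math.exp(l) * c_L(-l, s)))  # symmetry defect
print(abs(heat_identity_gap(l, s)))               # derivative identity defect
```

An identity of this type is what allows derivatives with respect to the noise to be traded for derivatives with respect to $l$, which is the step where integration by parts enters.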

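Highly noisy BMS channels. We use the extrinsic form of the correlation
formula given by (21) and (23). First we expand the logarithms in $S_1$ and
$S_2$ in powers of $t_i$ and $t_j$ and then use various Nishimori identities.
After some tedious algebra (see Appendices B and C) we can organize the
expansion in powers of the channel parameters (2). In the high noise regime
this expansion is absolutely convergent. To lowest order we have

$$
\begin{aligned}
\frac{\partial^2}{\partial \epsilon_i \partial \epsilon_j} H_n(\underline{X} \mid \underline{Y}) &= \delta_{ij} S_1 + (1 - \delta_{ij}) S_2 \\
&= \frac{1}{2} \delta_{ij}\, m_2^{(2)} \left(E_{\underline{t}}[\langle \sigma_i \rangle^2] - 1\right) + \frac{1}{2} (1 - \delta_{ij}) \left[m_1^{(2)}\right]^2 E_{\underline{t}}\left[(\langle \sigma_i \sigma_j \rangle - \langle \sigma_i \rangle \langle \sigma_j \rangle)^2\right] + \dots \\
&= \frac{1}{2} \delta_{ij}\, m_2^{(2)} \left(E_{\underline{t}}[T_i^2] - 1\right) + \frac{1}{2} (1 - \delta_{ij}) \left[m_1^{(2)}\right]^2 E_{\underline{t}}\left[(T_{ij} - T_i T_j)^2\right] + \dots
\end{aligned}
\tag{25}
$$

The second derivative of the conditional entropy is directly related to the
average square of the code-bit or spin-spin correlation.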

4 The Interpolation Method 

We use the interpolation method in the form developed by Montanari. As
explained in [7] it is difficult to establish the bounds directly for the
standard ensembles. Rather, one introduces a "multi-Poisson" ensemble which
approximates the standard ensemble. Once the bounds are derived for the
multi-Poisson ensemble they are extended to the standard ensemble by a
limiting procedure. The interpolation construction is fairly complicated, so
it is helpful to briefly review the simpler pure Poisson case.

4.1 Poisson ensemble 

We introduce the ensemble Poisson-LDPC($n, 1 - r, P$) = $\mathcal{P}$ where
$n$ is the block length, $r$ the rate and $P(x) = \sum_k P_k x^k$ the check
degree distribution. A bipartite graph from the Poisson ensemble is
constructed as follows. The graph has $n$ variable nodes. For any $k$ choose a
Poisson number $m_k$ of check nodes with mean $n(1 - r) P_k$. Thus the graph
has a total of $m = \sum_k m_k$ check nodes, which is also a Poisson variable
with mean $n(1 - r)$. For each check node $c$ of degree $k$, choose $k$
variable nodes uniformly at random and connect them to $c$. One can show that
the left degree distribution concentrates around a Poisson distribution
$\Lambda_{\mathcal{P}}(x) = e^{P'(1)(1 - r)(x - 1)}$. In other words the
degree $l$ of a variable node is Poisson with mean $P'(1)(1 - r)$, and
$\Lambda_l$ is the corresponding fraction of variable nodes with degree $l$.
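The construction of a graph from the Poisson ensemble can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the variable nodes attached to a check are drawn with replacement (an assumption about what "uniformly at random" means here), and the Poisson sampler uses Knuth's multiplication method, which is adequate for the small means of this example.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's multiplication method for a Poisson variate (fine for moderate means)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def sample_poisson_ldpc(n, rate, check_pmf, rng):
    """Return a list of checks, each a list of k variable indices chosen uniformly at random."""
    checks = []
    for k, p_k in check_pmf.items():
        m_k = poisson(n * (1 - rate) * p_k, rng)   # number of degree-k checks
        for _ in range(m_k):
            checks.append([rng.randrange(n) for _ in range(k)])
    return checks

rng = random.Random(0)
n, rate = 50, 0.5
check_pmf = {3: 0.5, 6: 0.5}                        # hypothetical P_k
checks = sample_poisson_ldpc(n, rate, check_pmf, rng)

# Empirical variable degrees; their mean should be close to P'(1)(1 - r) = 2.25 here.
degree = [0] * n
for c in checks:
    for i in c:
        degree[i] += 1
print(len(checks), sum(degree) / n)
```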

The main idea behind the interpolation technique is to recursively remove
the check node constraints and compensate them with extra observations $U$
distributed as (1), where $d_V$ is a trial distribution to be optimized in the
final inequality. One can interpret these extra observations as coming from a
repetition code whose rate is tuned in such a way that the total design rate
$r$ remains fixed. More precisely let $s \in [0, 1]$ be an interpolating
parameter. At "time" $s$ we have a Poisson-LDPC($n, (1 - r)s, P$) =
$\mathcal{P}_s$ code. Besides the usual channel outputs $l_i$, each node $i$
receives $e_i$ extra i.i.d. observations $U_a^i$, $a = 1, ..., e_i$, where
$e_i$ is Poisson with mean $n(1 - r)(1 - s)$ (so the total effective rate is
fixed to $r$). The interpolating Gibbs measure is

$$ \mu_s(\underline{\sigma}) = \frac{1}{Z_s} \prod_{c} \frac{1}{2} (1 + \sigma_{\partial c}) \prod_{i=1}^{n} e^{\left(\frac{l_i}{2} + \sum_{a=1}^{e_i} U_a^i\right) \sigma_i} \tag{26} $$

Here $\prod_c$ is a product over checks of a given graph in the ensemble
$\mathcal{P}_s$. At $s = 1$ one recovers the original measure while at $s = 0$
(no checks) we have a simple product measure (corresponding to a repetition
code) which is tailored to yield the replica symmetric entropy
$h_{RS}[d_V; \Lambda_{\mathcal{P}}, P]$ (up to an extra constant).

The central result of [7] is the sum rule

$$ E_{\mathcal{P}}[h_n] = h_{RS}[d_V; \Lambda_{\mathcal{P}}, P] + \int_0^1 R_n(s)\, ds \tag{27} $$

Let us explain the notation. The first term on the right hand side,
$h_{RS}[d_V; \Lambda_{\mathcal{P}}, P]$, is the replica symmetric functional of
section 1 evaluated for the Poisson ensemble. The remainder term $R_n(s)$ is



oo _. 



[PiQ2p) - P\q2p){Q2p - q2p) - P{q2p)) 



2p,s 



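with $q_{2p} = E_V[(\tanh V)^{2p}]$ and $Q_{2p}$ the overlap parameters

$$ Q_{2p} = \frac{1}{n} \sum_{i=1}^{n} \sigma_i^{(1)} \sigma_i^{(2)} \cdots \sigma_i^{(2p)} \tag{28} $$

Here $\sigma_i^{(\alpha)}$, $\alpha = 1, 2, ..., 2p$, are $2p$ independent
copies (replicas) of the spin $\sigma_i$ and $\langle - \rangle_{2p,s}$ is the
Gibbs bracket associated to the product measure (replica measure)

$$ \prod_{\alpha=1}^{2p} \mu_s(\underline{\sigma}^{(\alpha)}) $$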
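For a given set of replica configurations the overlap parameter is just a site average of products of spins. A minimal sketch (the function name `overlap` and the example configurations are illustrative; in the paper the replicas are drawn from the product Gibbs measure, which is not simulated here):

```python
def overlap(replicas):
    """Q_2p = (1/n) sum_i sigma_i^(1) ... sigma_i^(2p) for a list of 2p spin configurations."""
    n = len(replicas[0])
    total = 0
    for i in range(n):
        prod = 1
        for sigma in replicas:
            prod *= sigma[i]
        total += prod
    return total / n

# Two replicas (p = 1) on n = 4 sites.
print(overlap([[1, 1, -1, -1], [1, 1, -1, -1]]))   # identical replicas
print(overlap([[1, 1, -1, -1], [1, -1, 1, -1]]))   # replicas agreeing on half of the sites
```

Identical replicas give the maximal overlap, while replicas that agree on exactly half of the sites give zero.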

4.2 Multi-Poisson ensemble 

The multi-Poisson-LDPC($n, \Lambda, P, \gamma$) = $\mathcal{MP}$ ensemble is a
more elaborate construction which allows one to approximate a target
LDPC($n, \Lambda, P$) ensemble. Its parameters are the block length $n$, the
target variable and check node degree distributions $\Lambda(x)$ and $P(x)$,
and the real number $\gamma$ which controls the closeness to the standard
ensemble. We recall that variable and check node degrees have finite maximum
degrees. The construction of a bipartite graph from the multi-Poisson ensemble
proceeds via rounds: the process starts with a high rate code and at each
round one adds a very small number of check nodes until one ends up with a
code with almost the desired rate and degree distribution. A graph process
$G_t$ is defined for discrete times $t = 0, ..., t_{max}$,
$t_{max} = \lfloor \Lambda'(1)/\gamma \rfloor - 1$, as follows. The set of $n$
variable nodes is partitioned into subsets $V_l$ of cardinality $n \Lambda_l$
for every $l$, and every node $i \in V_l$ is decorated with $l$ free sockets.
The number $d_i(t)$ keeps track of the number of free sockets on node $i$ once
round $t$ is completed. So for $t = 0$, $G_0$ has no check nodes and each
variable node $i \in V_l$ has $d_i(0) = l$ free sockets. At round $t$, $G_t$
is constructed from $G_{t-1}$ as follows. For all $k$, choose a Poisson number
$m_k^{(t)}$ of check nodes with mean $n \gamma P_k / P'(1)$. Connect each
outgoing edge of these new degree $k$ check nodes (added at time $t$) to
variable node $i$ according to the probability
$w_i(t) = \frac{d_i(t-1)}{\sum_{i} d_i(t-1)}$. This is the fraction of free
sockets at node $i$ after round $t - 1$ was completed. Once all new check
nodes are connected, update the number of free sockets for each variable node,
$d_i(t) = d_i(t-1) - \Delta_i(t)$, where $\Delta_i(t)$ is the number of times
the variable node $i$ was chosen during round $t$. For $n \to \infty$ this
construction yields graphs with variable degree distributions
$\Lambda_\gamma(x)$ (the check degree distribution remains $P(x)$). The
variational distance between $\Lambda_\gamma(x)$ and $\Lambda(x)$ tends to
zero as $\gamma \to 0$.

The interpolating ensemble now uses two parameters $(t_*, s)$ where
$t_* \in \{0, ..., t_{max}\}$ and $0 \le s \le \gamma$. For rounds
$0, ..., t_* - 1$ one proceeds exactly as before to obtain a graph
$G_{t_* - 1}$. At the next round $t_*$, one proceeds as before but with
$\gamma$ replaced by $s$. The rate loss is compensated by adding $e_i$ extra
observations for each node $i$, where $e_i$ is a Poisson integer with mean
$n(\gamma - s) w_i(t_*)$. The round is ended by updating the number of free
sockets, $d_i(t_*) = d_i(t_* - 1) - \Delta_i(t_*) - e_i(t_*)$. Finally, for
rounds $t_* + 1, ..., t_{max}$ no new check node is added but for each
variable node $i$, $e_i$ external observations are added, where $e_i$ is a
Poisson integer with mean $n \gamma w_i(t)$. Moreover the free socket counter
is updated as $d_i(t) = d_i(t - 1) - e_i(t)$. Recall that the external
observations are i.i.d. copies of the random variable $U$ (see (1)).

The interpolating Gibbs measure $\mu_{t_*,s}(\underline{\sigma})$ has the same
form as (26) with the appropriate products over checks and extra observations.
Let $h_{n,\gamma}$ be the conditional entropy of the multi-Poisson ensemble
$\mathcal{MP}$ (corresponding to $t_* = t_{max}$ and $s = \gamma$). Again, the
central result of [7] is the sum rule



$$ E_{\mathcal{MP}}[h_{n,\gamma}] = h_{RS}[d_V; \Lambda_\gamma, P] + \sum_{t_*=0}^{t_{max}} \int_0^{\gamma} R_n(t_*, s)\, ds + o_n(1) \tag{29} $$

Explanations on the notation are in order. The first term
$h_{RS}[d_V; \Lambda_\gamma, P]$ is the replica symmetric functional of
section 1 evaluated for the multi-Poisson ensemble. The remainder term
$R_n(t_*, s)$ is given by



00 _. 



[PiQ2p) - P'{q2p){Q2p - q2p) - Piq2p)),^,^^, 

(30) 



where $q_{2p} = E_V[(\tanh V)^{2p}]$ as before and $Q_{2p}$ are modified
overlap parameters

$$ Q_{2p} = \sum_{i=1}^{n} w_i(t_*)\, X_i(t_*)\, \sigma_i^{(1)} \sigma_i^{(2)} \cdots \sigma_i^{(2p)} \tag{31} $$

Here as before $\sigma_i^{(\alpha)}$, $\alpha = 1, 2, ..., 2p$, are $2p$
independent copies (replicas) of the spin $\sigma_i$ and
$\langle - \rangle_{2p,t_*,s}$ is the Gibbs bracket associated to the product
measure

$$ \prod_{\alpha=1}^{2p} \mu_{t_*,s}(\underline{\sigma}^{(\alpha)}) $$

The overlap parameter is now more complicated than in the Poisson case because of the (positive) terms $w_i(t_*)$ and $X_i(t_*)$. Here $X_i(t_*)$ are new i.i.d. random variables whose precise description is quite technical and can be found in [7]. The reader may think of the terms $w_i(t_*) X_i(t_*)$ as behaving like the $\frac{1}{n}$ factor of the pure Poisson ensemble overlap parameter (28). More precisely the only properties (see Appendix E in [7]) that we need are

$$\sum_{i=1}^{n} w_i(t_*) = 1, \qquad \mathbb{P}\Big[w_i(t_*) \le \frac{A}{n}\Big] \ge 1 - e^{-Bn} \tag{32}$$

and

$$0 \le X_i(t_*) \le x, \qquad \mathbb{E}[x^k] \le A_k \tag{33}$$

for any finite $k$ and finite positive constants $A$, $B$, $A_k$ independent of $n$ (they may depend on some of the other parameters but this turns out to be unimportant). Finally we use the shorthand $\mathbb{E}_s[-]$ for the expectation with respect to all random variables involved in the interpolation measure. The subscript $s$ is here to remind us that this expectation depends on $s$, a fact that is important to keep in mind because the remainder involves an integral over $s$. When we use $\mathbb{E}$ (without the subscript $s$; as in (33) for example) it means that the quantity does not depend on $s$. In the sequel the replicated Gibbs bracket $\langle - \rangle_{2p,t_*,s}$ is simply denoted by $\langle - \rangle_s$. There will be no risk of confusion because the only property that we use is its linearity.
In [7] it is shown that

$$\mathbb{E}_{\mathcal{C}}[h_n] = \mathbb{E}_{\mathcal{MP}}[h_{n,\gamma}] + O(\gamma^b) + o_n(1) \tag{34}$$

where $O(\gamma^b)$ is uniform in $n$ ($b > 0$ a numerical constant) and $o_n(1)$ (which depends on $\gamma$) tends to $0$ as $n \to +\infty$.

In the next paragraph we prove the variational bound on the conditional entropy of the multi-Poisson ensemble, namely

$$\liminf_{n\to+\infty} \mathbb{E}_{\mathcal{MP}}[h_{n,\gamma}] \ge h_{RS}[d_V; \Lambda_\gamma, P] \tag{35}$$

Note that here $o_n(1)$ again depends on $\gamma$. By combining this bound with (34) and taking limits

$$\liminf_{n\to+\infty} \mathbb{E}_{\mathcal{C}}[h_n] = \lim_{\gamma\to 0}\, \liminf_{n\to+\infty} \mathbb{E}_{\mathcal{MP}}[h_{n,\gamma}] \ge \lim_{\gamma\to 0} h_{RS}[d_V; \Lambda_\gamma, P] = h_{RS}[d_V; \Lambda, P] \tag{36}$$

The main theorem 1 then follows by maximizing the right hand side over $d_V$.

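4.3 Proof of the Variational Bound (35)

In view of the sum rule (29) it is sufficient to prove that $\liminf_{n\to+\infty} R_n(t_*, s) \ge 0$. In the case of a convex $P$ considered in [7] this is immediate because convexity is equivalent to

$$P(Q_{2p}) - P(q_{2p}) \ge P'(q_{2p})(Q_{2p} - q_{2p})$$

Note that $P(x) = \sum_k P_k x^k$ is anyway convex for $x \ge 0$ since all $P_k \ge 0$. So if we do not assume convexity of the check node degree distribution we have to circumvent the fact that $Q_{2p}$ can be negative. But note

$$\langle Q_{2p} \rangle_s = \sum_{i=1}^{n} w_i(t_*) X_i(t_*) \langle \sigma_i^{(1)} \sigma_i^{(2)} \cdots \sigma_i^{(2p)} \rangle_s = \sum_{i=1}^{n} w_i(t_*) X_i(t_*) \langle \sigma_i^{(1)} \rangle_s \langle \sigma_i^{(2)} \rangle_s \cdots \langle \sigma_i^{(2p)} \rangle_s = \sum_{i=1}^{n} w_i(t_*) X_i(t_*) \langle \sigma_i \rangle_s^{2p} \ge 0$$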
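The factorization over replicas used in the last display can be checked by brute force on a toy example. The sketch below is only illustrative: the three-spin weight `toy_weight`, the numbers in `w` and `X`, and all function names are assumptions made for this example and are not the objects constructed in [7]. It builds the product measure over $2p = 4$ replicas explicitly and compares $\langle Q_{2p} \rangle$ with $\sum_i w_i X_i \langle \sigma_i \rangle^{2p}$.

```python
import itertools
import math

def gibbs_measure(n, weight):
    """Normalize an arbitrary positive weight over {-1,+1}^n (toy Gibbs measure)."""
    configs = list(itertools.product([-1, 1], repeat=n))
    raw = [weight(c) for c in configs]
    z = sum(raw)
    return configs, [r / z for r in raw]

def overlap_bracket(configs, probs, w, X, replicas):
    """<Q> computed by brute force under the product measure over `replicas` copies."""
    total = 0.0
    for combo in itertools.product(range(len(configs)), repeat=replicas):
        prob = math.prod(probs[a] for a in combo)
        q = sum(w[i] * X[i] * math.prod(configs[a][i] for a in combo)
                for i in range(len(w)))
        total += prob * q
    return total

def factorized_bracket(configs, probs, w, X, replicas):
    """sum_i w_i X_i <sigma_i>^replicas, the closed form used in the text."""
    mags = [sum(p * c[i] for c, p in zip(configs, probs)) for i in range(len(w))]
    return sum(w[i] * X[i] * mags[i] ** replicas for i in range(len(w)))

def toy_weight(c):
    # Hypothetical weight: independent fields times one soft parity check.
    fields = (0.3, -0.8, 0.5)
    field_term = math.exp(sum(l * s for l, s in zip(fields, c)))
    return field_term * (1.0 + 0.6 * c[0] * c[1] * c[2])

configs, probs = gibbs_measure(3, toy_weight)
w = [0.2, 0.5, 0.3]   # positive weights summing to 1, mimicking (32)
X = [1.1, 0.4, 0.9]   # positive bounded variables, mimicking (33)
lhs = overlap_bracket(configs, probs, w, X, replicas=4)    # 2p = 4
rhs = factorized_bracket(configs, probs, w, X, replicas=4)
print(lhs, rhs)
```

The two numbers agree up to rounding, and the common value is non-negative because the number of replicas is even and the weights are positive.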

Therefore we are assured that for any $P$ (i.e. not necessarily convex for $x \in \mathbb{R}$) we have

$$P(\langle Q_{2p} \rangle_s) - P(q_{2p}) \ge P'(q_{2p})(\langle Q_{2p} \rangle_s - q_{2p}) \tag{37}$$

and the proof will follow if we can show that with high probability

$$P(Q_{2p}) \approx P(\langle Q_{2p} \rangle_s)$$
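To see why the sign of the argument matters, the following sketch evaluates the tangent-line gap $P(a) - P(b) - P'(b)(a - b)$ for the hypothetical check degree polynomial $P(x) = x^3$ (chosen here only for illustration; it has non-negative coefficients but is not convex on the whole real line). The gap is non-negative on a grid of non-negative arguments, and becomes negative for a negative $a$.

```python
def poly(coeffs, x):
    """Evaluate P(x) = sum_k P_k x^k for coefficients given as {k: P_k}."""
    return sum(c * x ** k for k, c in coeffs.items())

def dpoly(coeffs, x):
    """Evaluate P'(x)."""
    return sum(k * c * x ** (k - 1) for k, c in coeffs.items() if k > 0)

def tangent_gap(coeffs, a, b):
    """P(a) - P(b) - P'(b)(a - b); non-negative whenever P is convex between a and b."""
    return poly(coeffs, a) - poly(coeffs, b) - dpoly(coeffs, b) * (a - b)

P = {3: 1.0}                                  # illustrative choice: P(x) = x^3
grid = [i / 20 for i in range(0, 41)]         # arguments in [0, 2]
min_gap_nonneg = min(tangent_gap(P, a, b) for a in grid for b in grid)
gap_negative_arg = tangent_gap(P, -1.0, 0.25)  # a < 0 mimics a negative Q_2p
print(min_gap_nonneg, gap_negative_arg)
```

For $P(x) = x^3$ the gap equals $(a - b)^2 (a + 2b)$, which makes both observations explicit.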

The following concentration estimate will suffice and is proven in section 5.

Proposition 3. Fix any $\delta < \frac{1}{4}$. On the BEC($\epsilon$) and BIAWGNC($\epsilon$) for a.e. $\epsilon$, or on general BMS($\epsilon$) satisfying H, we have for a.e. $\epsilon$,

$$\lim_{n\to+\infty} \int_0^{\gamma} ds\, \mathbb{P}_s\Big[\, |P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| > \frac{2p}{n^{\delta}} \Big] = 0 \tag{38}$$

Here $\mathbb{P}_s(X)$ is the probability distribution $\mathbb{E}_s \langle \mathbb{1}_X \rangle_s$.

This proposition can presumably be strengthened in two directions. First we conjecture that hypothesis H is not needed (this is indeed the case for the BEC and BIAWGNC). Secondly the statement should hold for all $\epsilon$ except at a finite set of threshold values of $\epsilon$ where the conditional entropy is not differentiable, and its first derivative is expected to have jumps (except for cycle codes where higher order derivatives are singular). Since we are unable to control the locations of these jumps our proof only works for Lebesgue almost every $\epsilon$.

We are now ready to complete the proof of the variational bound (35).

End of Proof of (35). From (32) and (33)

$$|Q_{2p}| \le \sum_{i=1}^{n} w_i(t_*) X_i(t_*) \le x \tag{39}$$

and

$$\mathbb{E}_s[\langle Q_{2p}^{k} \rangle_s] \le A_k \tag{40}$$

Combined with $q_{2p} \le 1$, this implies (since the maximal degree of $P$ is finite) that

$$\Big| \mathbb{E}_s\big[\langle P(Q_{2p}) - P'(q_{2p})(Q_{2p} - q_{2p}) - P(q_{2p}) \rangle_s\big] \Big| \le C_1 \tag{41}$$

for some positive constant $C_1$. The only crucial feature here is that this constant does not depend on $n$ and on the number of replicas $2p$ (a more detailed analysis shows that it depends only on the degree of $P(x)$).

Now we split the sum (30) into terms with $1 \le p \le n^{\delta}$ (call this contribution $R_a$) and terms with $p > n^{\delta}$ (call this contribution $R_b$), where $\delta > 0$ is the constant of proposition 3. For the second contribution (41) implies

$$|R_b| \le C_1 \sum_{p > n^{\delta}} \frac{1}{2p(2p-1)} = O(n^{-\delta})$$
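The size of the discarded tail can be made concrete. Since $2p(2p-1) \ge 4p(p-1)$, the tail $\sum_{p > N} \frac{1}{2p(2p-1)}$ is at most $\sum_{p > N} \frac{1}{4p(p-1)} = \frac{1}{4N}$, and the full series sums to $\ln 2$. The short numerical sketch below (function names and truncation lengths are choices made for this illustration) checks both facts.

```python
import math

def partial_sum(first, last):
    """Sum of 1 / (2p(2p-1)) for p = first, ..., last."""
    return sum(1.0 / (2 * p * (2 * p - 1)) for p in range(first, last + 1))

# The whole series equals ln 2 because 1/(2p(2p-1)) = 1/(2p-1) - 1/(2p).
full = partial_sum(1, 200000)

# Truncated tails starting at N + 1, compared with the telescoping bound 1/(4N).
checks = []
for N in (1, 5, 50, 500):
    tail = partial_sum(N + 1, N + 200000)
    checks.append((N, tail, 1.0 / (4 * N)))

print(full, math.log(2))
for N, tail, bound in checks:
    print(N, tail, bound)
```

With $N = n^{\delta}$ this is the $O(n^{-\delta})$ decay of the contribution of the terms with $p > n^{\delta}$.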

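For the first contribution we write

$$R_a = \sum_{p \le n^{\delta}} \frac{1}{2p(2p-1)}\, \mathbb{E}_s\big[\langle P(Q_{2p}) \rangle_s - P(\langle Q_{2p} \rangle_s)\big] + \sum_{p \le n^{\delta}} \frac{1}{2p(2p-1)}\, \mathbb{E}_s\big[P(\langle Q_{2p} \rangle_s) - P'(q_{2p})(\langle Q_{2p} \rangle_s - q_{2p}) - P(q_{2p})\big]$$

In this equation, the second sum is non-negative due to (37). Thus we find

$$R_n(t_*, s) = R_a + R_b \ge \sum_{p \le n^{\delta}} \frac{1}{2p(2p-1)}\, \mathbb{E}_s\big[\langle P(Q_{2p}) \rangle_s - P(\langle Q_{2p} \rangle_s)\big] - O(n^{-\delta}) \tag{42}$$

Below we use proposition 3 to show that for almost every $\epsilon$ in the appropriate range

$$\lim_{n\to+\infty} \int_0^{\gamma} ds \sum_{p \le n^{\delta}} \frac{1}{2p(2p-1)}\, \mathbb{E}_s\big[\langle P(Q_{2p}) \rangle_s - P(\langle Q_{2p} \rangle_s)\big] = 0 \tag{43}$$

which implies by Fatou's lemma

$$\liminf_{n\to+\infty} \int_0^{\gamma} R_n(t_*, s)\, ds \ge 0$$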

and thus proves (35) for almost every $\epsilon$ in the appropriate range. A general convexity argument allows us to extend this result to all $\epsilon$ in the same range. Indeed convexity arguments imply that both sides of the inequality (35) are continuous functions of $\epsilon$. To show continuity of the left hand side we use inequality (56) in Appendix C: it implies that there exists a positive number $\rho$ (independent of $\epsilon$ and $n$) such that $\frac{d^2}{d\epsilon^2} \mathbb{E}_s[h_{n,\gamma}] \ge -\rho$. Therefore $\mathbb{E}_s[h_{n,\gamma}] + \frac{\rho}{2} \epsilon^2$ is convex in $\epsilon$; so the $\liminf_{n\to+\infty}$ is also convex and thus continuous on any open $\epsilon$ set. To show continuity of the right hand side we first note that for each $d_V$, $h_{RS}$ is a linear functional of the channel distribution $c_L(l)$; thus the $\sup_{d_V}$ is a convex functional of $c_L(l)$; thus it is continuous in any open $\epsilon$ set where $c_L(l)$ varies smoothly in $\epsilon$ (this last point can be made more precise using tools from functional analysis).

Footnote: At this point one could use arguments involving physical degradation if BMS($\epsilon$) is degraded as a function of $\epsilon$. But we take a more direct route that does not assume physical degradation as a function of $\epsilon$.

Let us now prove (43). First we set

$$F_{2p} = |\langle P(Q_{2p}) \rangle_s - P(\langle Q_{2p} \rangle_s)|$$

and use Cauchy-Schwarz and then (40) to obtain

$$\mathbb{E}_s[F_{2p}] = \mathbb{E}_s\big[F_{2p} \mathbb{1}_{F_{2p} \le 2p/n^{\delta}}\big] + \mathbb{E}_s\big[F_{2p} \mathbb{1}_{F_{2p} > 2p/n^{\delta}}\big] \le \frac{2p}{n^{\delta}} + C_2\, \mathbb{P}_s\Big[F_{2p} > \frac{2p}{n^{\delta}}\Big]^{1/2}$$

for some positive constant $C_2$ independent of $n$ and $p$ (depending only on the degree of $P(x)$). Thus

$$\int_0^{\gamma} ds \sum_{p \le n^{\delta}} \frac{1}{2p(2p-1)}\, \mathbb{E}_s[F_{2p}] \le \frac{\gamma}{n^{\delta}} \sum_{p \le n^{\delta}} \frac{1}{2p-1} + C_2 \sqrt{\gamma} \sum_{p \le n^{\delta}} \frac{1}{2p(2p-1)} \Big( \int_0^{\gamma} ds\, \mathbb{P}_s\Big[F_{2p} > \frac{2p}{n^{\delta}}\Big] \Big)^{1/2}$$

In the second inequality we have permuted the integral with a finite sum and used Cauchy-Schwarz. Finally we can apply proposition 3 and Lebesgue's dominated convergence theorem to the last sum over $p$, to conclude that (43) holds.



5 Fluctuations of overlap parameters 

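In this section we prove proposition 3. The proofs are done directly for the multi-Poisson ensemble. We start by a relation between the overlap fluctuation and the spin-spin correlation.

Lemma 1. For any BMS($\epsilon$) channel there exists a finite constant $C_3$ independent of $n$ and $p$ (depending only on the maximal check degree) such that

$$\mathbb{P}_s\Big[\, |P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| > \frac{2p}{n^{\delta}} \Big] \le \frac{n^{2\delta - \frac{1}{2}}}{2p}\, C_3 \Big( \sum_{j=1}^{n} \mathbb{E}_s\big[(\langle \sigma_i \sigma_j \rangle_s - \langle \sigma_i \rangle_s \langle \sigma_j \rangle_s)^2\big] \Big)^{1/2} \tag{44}$$

Proof. Using the identity

$$Q_{2p}^{k} - \langle Q_{2p} \rangle_s^{k} = (Q_{2p} - \langle Q_{2p} \rangle_s) \sum_{l=0}^{k-1} Q_{2p}^{k-l-1} \langle Q_{2p} \rangle_s^{l} \tag{45}$$

and (39) we get

$$|P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| = |Q_{2p} - \langle Q_{2p} \rangle_s|\, \Big| \sum_k P_k \sum_{l=0}^{k-1} Q_{2p}^{k-l-1} \langle Q_{2p} \rangle_s^{l} \Big| \le |Q_{2p} - \langle Q_{2p} \rangle_s| \sum_k k P_k x^{k-1} = P'(x)\, |Q_{2p} - \langle Q_{2p} \rangle_s|$$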

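Here $x$ is the bound in (39). Therefore applying the Chebyshev inequality

$$\mathbb{P}_s\Big[\, |P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| > \frac{2p}{n^{\delta}} \Big] \le \frac{n^{2\delta}}{4p^2}\, \mathbb{E}_s\big[P'(x)^2 (\langle Q_{2p}^2 \rangle_s - \langle Q_{2p} \rangle_s^2)\big] \tag{46}$$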
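The two elementary facts behind the last steps, namely the telescoping identity (45) and the resulting Lipschitz-type bound $|P(Q) - P(m)| \le P'(x) |Q - m|$ for $|Q|, |m| \le x$, can be tested numerically. The sketch below uses an arbitrary polynomial with non-negative coefficients and random test points; the polynomial, the value of `x`, and the function names are assumptions made only for this check.

```python
import random

def poly(coeffs, v):
    """Evaluate P(v) = sum_k P_k v^k for coefficients given as {k: P_k}."""
    return sum(c * v ** k for k, c in coeffs.items())

def dpoly(coeffs, v):
    """Evaluate P'(v)."""
    return sum(k * c * v ** (k - 1) for k, c in coeffs.items() if k > 0)

def telescoping(q, m, k):
    """Right hand side of (45): (q - m) * sum_{l=0}^{k-1} q^(k-l-1) m^l."""
    return (q - m) * sum(q ** (k - l - 1) * m ** l for l in range(k))

random.seed(0)
P = {3: 0.4, 6: 0.6}   # illustrative degree polynomial with P_k >= 0
x = 1.7                # plays the role of the bound in (39)

worst_identity = 0.0          # largest violation of the identity (45)
worst_slack = float("inf")    # smallest value of P'(x)|q - m| - |P(q) - P(m)|
for _ in range(10000):
    q = random.uniform(-x, x)
    m = random.uniform(-x, x)
    for k in P:
        worst_identity = max(worst_identity,
                             abs(q ** k - m ** k - telescoping(q, m, k)))
    slack = dpoly(P, x) * abs(q - m) - abs(poly(P, q) - poly(P, m))
    worst_slack = min(worst_slack, slack)

print(worst_identity, worst_slack)
```

The first printed number is at rounding level and the second is non-negative, which is all that the Chebyshev step leading to (46) requires.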



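From the definition of the overlap parameters it follows that

$$\langle Q_{2p}^2 \rangle_s - \langle Q_{2p} \rangle_s^2 = \sum_{i,j=1}^{n} w_i(t_*) w_j(t_*) X_i(t_*) X_j(t_*) \big( \langle \sigma_i \sigma_j \rangle_s^{2p} - \langle \sigma_i \rangle_s^{2p} \langle \sigma_j \rangle_s^{2p} \big) \le 2p \sum_{i,j=1}^{n} x^2 w_i(t_*) w_j(t_*)\, |\langle \sigma_i \sigma_j \rangle_s - \langle \sigma_i \rangle_s \langle \sigma_j \rangle_s|$$

Substituting in (46) and applying Cauchy-Schwarz to $\sum_{i,j} \mathbb{E}_s[-]$ we get

$$\mathbb{P}_s\Big[\, |P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| > \frac{2p}{n^{\delta}} \Big] \le \frac{n^{2\delta}}{2p} \Big( \sum_{i,j=1}^{n} \mathbb{E}_s\big[x^4 P'(x)^4 w_i(t_*)^2 w_j(t_*)^2\big] \Big)^{1/2} \Big( \sum_{i,j=1}^{n} \mathbb{E}_s\big[(\langle \sigma_i \sigma_j \rangle_s - \langle \sigma_i \rangle_s \langle \sigma_j \rangle_s)^2\big] \Big)^{1/2}$$

From (32), (33) it is easy to see that for any $i, j$

$$\mathbb{E}_s\big[x^4 P'(x)^4 w_i(t_*)^2 w_j(t_*)^2\big] \le \frac{C_3^2}{n^4}$$

where $C_3$ is independent of $n$. It follows that

$$\mathbb{P}_s\Big[\, |P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| > \frac{2p}{n^{\delta}} \Big] \le \frac{n^{2\delta - 1}}{2p}\, C_3 \Big( \sum_{i,j=1}^{n} \mathbb{E}_s\big[(\langle \sigma_i \sigma_j \rangle_s - \langle \sigma_i \rangle_s \langle \sigma_j \rangle_s)^2\big] \Big)^{1/2} = \frac{n^{2\delta - \frac{1}{2}}}{2p}\, C_3 \Big( \sum_{j=1}^{n} \mathbb{E}_s\big[(\langle \sigma_i \sigma_j \rangle_s - \langle \sigma_i \rangle_s \langle \sigma_j \rangle_s)^2\big] \Big)^{1/2} \tag{47}$$

In the last equality we have used the symmetry of the ensemble with respect to variable node permutations. □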

Denote by $h_{n,\gamma}(t_*, s)$ the entropy of the $\mu_{t_*,s}$ interpolating measure. Note that this should not be confused with the multi-Poisson ensemble entropy $h_{n,\gamma}$ (which corresponds to $t_* = t_{\max}$ and $s = \gamma$).

Lemma 2. For the BEC and BIAWGNC with any noise value and for general BMS($\epsilon$) channels satisfying H we have

$$\sum_{j=1}^{n} \mathbb{E}_s\big[(\langle \sigma_i \sigma_j \rangle_s - \langle \sigma_i \rangle_s \langle \sigma_j \rangle_s)^2\big] \le F(\epsilon) + G(\epsilon) \frac{d^2}{d\epsilon^2} \mathbb{E}_s[h_{n,\gamma}(t_*, s)] \tag{48}$$

where $F(\epsilon)$ and $G(\epsilon)$ are two finite constants depending only on the channel parameter.

The proof of lemma 2 is based on the correlation formula of section [TJ. These are true for any linear code ensemble so they are in particular true for the interpolating $(t_*, s)$ ensemble. For the BEC and BIAWGNC we have already shown the two equalities (jH]) and (ITT1): thus the inequality (48) is in fact an equality for appropriate values of $F$ and $G$. The case of general (but highly noisy) BMS channels is presented in appendix C. A converse inequality can also be proven by the methods of appendices B and C.

Footnote: In fact one has to check that the addition of $\sum_{a=1}^{e_i} U_a$ to $l_i$ does not change the derivation and the final formulas. For this it suffices to follow the calculation of section [3].

Proof of proposition 3. Note that for all points of the parameter space $(\epsilon, s)$ such that the second derivative of the average conditional entropy is bounded uniformly in $n$ the proof immediately follows from (47), (48) (and the last inequality before that one) by choosing $\delta < \frac{1}{4}$. However, in the large block length limit $n \to +\infty$, generically the first derivative of the average conditional entropy has jumps for some threshold values of $\epsilon$ (these values depend on the interpolation parameter $s$). This means that for these threshold values the second derivative cannot be bounded uniformly in $n$. Since we cannot control these locations we introduce a test function $\psi(\epsilon)$: non negative, infinitely differentiable and with small enough bounded support included in the range of $\epsilon$ satisfying H. We consider the averaged quantity

$$\mathcal{Q} = \int d\epsilon\, \psi(\epsilon) \int_0^{\gamma} ds\, \mathbb{P}_s\Big[\, |P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| > \frac{2p}{n^{\delta}} \Big] \tag{49}$$

Writing $\psi(\epsilon) = \sqrt{\psi(\epsilon)} \sqrt{\psi(\epsilon)}$, Cauchy-Schwarz implies

$$\mathcal{Q} \le \int_0^{\gamma} ds \Big( \int d\epsilon\, \psi(\epsilon)\, \mathbb{P}_s\Big[\, |P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| > \frac{2p}{n^{\delta}} \Big]^2 \Big)^{1/2}$$

Combining this inequality with (47) and (48) we get

$$\mathcal{Q} \le \frac{n^{2\delta - \frac{1}{2}}}{2p}\, C_3 \int_0^{\gamma} ds \Big( \int d\epsilon\, \psi(\epsilon) \Big( F(\epsilon) + G(\epsilon) \frac{d^2}{d\epsilon^2} \mathbb{E}_s[h_{n,\gamma}(t_*, s)] \Big) \Big)^{1/2}$$

$$= \frac{n^{2\delta - \frac{1}{2}}}{2p}\, C_3 \int_0^{\gamma} ds \Big( \int d\epsilon\, \psi(\epsilon) F(\epsilon) - \int d\epsilon\, \frac{d}{d\epsilon}\big(\psi(\epsilon) G(\epsilon)\big) \frac{d}{d\epsilon} \mathbb{E}_s[h_{n,\gamma}(t_*, s)] \Big)^{1/2}$$

Note that from the bounds in appendix C, $F(\epsilon)$, $G(\epsilon)$ and $G'(\epsilon)$ are integrable except possibly at the edge of the $\epsilon$ range defined by H. This is not a problem because we can take the support of $\psi(\epsilon)$ away from such points or alternatively take a $\psi(\epsilon)$ which vanishes sufficiently fast at these points. Moreover the first derivative of the average conditional entropy is bounded uniformly in $n$ and $s$ (see appendix D) by a constant $k(\epsilon)$ that has at most a power singularity at $\epsilon = 0$, and again this is not a problem. Thus by choosing $0 < \delta < \frac{1}{4}$ we obtain

$$\lim_{n\to+\infty} \mathcal{Q} = 0$$

Applying Lebesgue's dominated convergence theorem to convergent subsequences (of the integrand of $\int d\epsilon\, \psi(\epsilon)$ in (49)) we deduce that

$$\int d\epsilon\, \psi(\epsilon) \lim_{k\to+\infty} \int_0^{\gamma} ds\, \mathbb{P}_s\Big[\, |P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| > \frac{2p}{n_k^{\delta}} \Big] = 0$$

which implies that along any convergent subsequence, for almost all $\epsilon$

$$\lim_{k\to+\infty} \int_0^{\gamma} ds\, \mathbb{P}_s\Big[\, |P(Q_{2p}) - P(\langle Q_{2p} \rangle_s)| > \frac{2p}{n_k^{\delta}} \Big] = 0 \tag{50}$$

as long as $\delta < \frac{1}{4}$. Now we apply this last statement to two subsequences that attain the $\liminf$ and the $\limsup$ (on the intersection of the two measure one $\epsilon$ sets). This proves that the $\lim_{n\to+\infty}$ exists and vanishes. □

6 Conclusion 

The main new tools introduced in this paper are relationships between the second derivative of the conditional entropy and correlation functions or mutual information between code bits. This allowed us to estimate the overlap fluctuations in order to get a better handle on the remainder. Some aspects of our analysis bear some similarity with techniques introduced by Talagrand [17] but are independent. One difference is that we use specific symmetry properties of the communications problem.

We expect that the technique developed here can be extended to remove the restriction to high noise (condition H). Indeed the only place in the analysis where we need this restriction is lemma 2. For the BEC and BIAWGNC the lemma is trivially satisfied for any noise level (with appropriate constants). Another issue that would be worthwhile investigating is whether the related inequalities of paragraph 1.3 and the converse of lemma 2 can be derived irrespective of the noise level.

The next obvious problem is to prove the converse of the variational bound (theorem 1).

For this one should show that the remainder vanishes when $d_V$ is replaced by the maximizing distribution of $h_{RS}[d_V; \Lambda, P]$. This program has been carried out explicitly in the case of the BEC and the Poisson ensemble [12]. It would be desirable to extend this to more general ensembles and channels but the problem becomes quite hard. However a similar program has been successfully carried out for a p-spin model with gauge symmetry (see [10]). A solution of these problems would allow for a rigorous determination of MAP thresholds and would extend our understanding of the intimate relationship between BP and MAP decoding.

Footnote: In the present context gauge symmetry and channel symmetry are equivalent.
A Appendix A

We prove the identities (15), (16), (17). By definition






and 



(T C 


11 






e 2 "^^ = e 2 





Thus 



and plugging the identity 



l-U 

in the brackets immediately leads to (15). For the second and third identities we proceed similarly. Namely,

(cTiC 2'"»e 2^^) 



and 



Plugging 









(e 2'^'e 2'^i) 






in the brackets, leads immediately to (16) and (17).

B Appendix B

We indicate the main steps of the derivation of the full high noise expansion for

$$\frac{\partial^2}{\partial \epsilon_i\, \partial \epsilon_j} H_n(X \mid Y) = \delta_{ij} S_1 + (1 - \delta_{ij}) S_2$$

The expansion for $S_1$ is given by (51) and that for $S_2$ by (54). They are derived in a form that is suitable to prove lemma 2 of section 5 (see appendix C). For this latter proof we need to extract a square correlation at each order as in (54). This is achieved here through the use of appropriate remarkable Nishimori identities, and in order to use these we take the extrinsic forms (21) and (23) of $S_1$ and $S_2$.

Let us start with $S_1$ which is simple. Using the power series expansion of $\ln(1 + x)$ we obtain an infinite series for $S_1$ which we will now simplify. Because of the Nishimori identities

$$\mathbb{E}[t_i^{2p-1}] = \mathbb{E}[t_i^{2p}], \qquad \mathbb{E}\big[\langle \sigma_i \rangle_{\sim i}^{2p-1}\big] = \mathbb{E}\big[\langle \sigma_i \rangle_{\sim i}^{2p}\big]$$

we can combine odd and even terms and get

+00 (2p) 

m 



s-E2Ri3T)(«.-l(->Sl-i) 



(51) 



This series is absolutely convergent as long as 



+00 ^(2p) 

TT < +00 



■^-^ m^ 






-2p{2p- 

which is true for channels satisfying H. 
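The Nishimori identity $\mathbb{E}[t^{2p-1}] = \mathbb{E}[t^{2p}]$ used above is a consequence of channel symmetry alone and is easy to verify on standard examples. The sketch below assumes the all $+1$ codeword is transmitted and takes $t = \tanh(L/2)$ with $L$ the usual log-likelihood ratio; this normalization, the noise parametrization of the Gaussian channel by its standard deviation `sigma`, and the function names are assumptions of this illustration and may differ from the conventions of the main text. The binary symmetric channel is treated exactly and the Gaussian channel by a simple trapezoidal quadrature.

```python
import math

def bsc_moment(p, k):
    """E[t^k] on a BSC with flip probability p, all +1 transmitted, t = tanh(L/2)."""
    llr = math.log((1 - p) / p)
    t = math.tanh(llr / 2)          # equals 1 - 2p
    return (1 - p) * t ** k + p * (-t) ** k

def awgn_moment(sigma, k, half_width=12.0, steps=20000):
    """E[t^k] on a binary-input Gaussian channel, y = 1 + sigma * noise, L = 2y / sigma^2."""
    lo = 1.0 - half_width * sigma
    h = 2 * half_width * sigma / steps
    total = 0.0
    for i in range(steps + 1):
        y = lo + i * h
        dens = math.exp(-(y - 1.0) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
        weight = 0.5 if i in (0, steps) else 1.0   # trapezoidal rule end points
        total += weight * dens * math.tanh(y / sigma ** 2) ** k
    return total * h

# Largest difference between odd and even moments over a few channels and orders.
bsc_gap = max(abs(bsc_moment(p, 2 * k - 1) - bsc_moment(p, 2 * k))
              for p in (0.05, 0.2, 0.4) for k in (1, 2, 3, 4))
awgn_gap = max(abs(awgn_moment(s, 2 * k - 1) - awgn_moment(s, 2 * k))
               for s in (0.8, 1.5) for k in (1, 2, 3))
print(bsc_gap, awgn_gap)
```

Both gaps are at the level of rounding or quadrature error, which is why odd and even terms of the expansion can be combined pairwise.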

In the rest of the appendix we deal with $S_2$ which is considerably more complicated. However the general idea is the same as above. First we use the expansion of $\ln(1 + x)$ to get

1 I t + {O'i/r^ijti + [O'j/r^ij'tj + {O'iO'j/^ijtitj I _ T _ TT _ TTT (en) 
where



^(_l)P+i/ \P 

p=i 



p 



J2^ (-^^p+^ 



p=i 



p 






p=i 



p 



V 



We expand the multinomial in I 

kjhlkj 



E P' fka+kc±kf,+kc I \ka I \ki, I \kc 



ka+ki,+kc=p 



and subtract the terms II and III. Then only terms that have powers of the form $t_i^k t_j^l$ with $k, l \ge 1$ will survive in (52). Moreover because of the identities $\mathbb{E}[t_i^{2k-1}] = \mathbb{E}[t_i^{2k}]$ and $\mathbb{E}[t_j^{2l-1}] = \mathbb{E}[t_j^{2l}]$ we find for $S_2$

+00 
82= Yl rnf"^^?'^ {Too + Toi + Tio + Tn) 

k>l>l 

+00 

+ J2 ^I'^^i'^ (^00 + Ui + T[, + T{0 (53) 



l>k>l 



with (we abuse notation by not indicating the $(kl)$ and $(ij)$ dependence in the $T$ and $T'$ factors)



T, 



kA 



2fc-K+2i-A 
p=2k—K 



(_1)P+1 



p! 



p (p - (2/ - A))!(j9 - (2A; - /t))!(2A; - K + 2/ - A - j9)! 



X E 



t~»j 



; \p-{2Z-A)/ ip-(2fc-re)/ \2fc-K+2i-A-p 



and 



$T'_{\kappa\lambda}$ = exchange $k, l$ and $\kappa, \lambda$ and $i, j$

The next simplification step occurs by using the Nishimori identity for the 
expectation in the above formula 



E 



i~».? 



(^.)::m^.)::i;-(^.^.): 






ii-jf^ij 



{<'^^i^^^JD^^A^^^{^Jm^^^3) 



\m3 
and using $\sigma_i \in \{\pm 1\}$, to "linearize" the terms $(\sigma_i \sigma_j)^{m_3} \sigma_i^{m_1} \sigma_j^{m_2}$. Tedious but straightforward algebra then yields



2k+2l-l 



_np+i 



(p+l)! 



^ ^'^ ^_^p{p + I) {p + I - ^ky.ip + 1 - 2l)\{2k + 21 - p - 1)\ 



X E- 



't~«j 



\P-2'+l/^.\P-2fc+l((^.CT,)^'^+2'-l-P 



{^^)l-''^'{^,)l 



nj 



A similar formula obtained by exchanging $k, l$ and $i, j$ holds for $\sum_{\kappa\lambda} T'_{\kappa\lambda}$. Replacing these sums in (53) yields a high noise expansion for $S_2$.

However this is not yet practical for us because we need to extract a general square correlation factor $(\langle \sigma_i \sigma_j \rangle_{\sim ij} - \langle \sigma_i \rangle_{\sim ij} \langle \sigma_j \rangle_{\sim ij})^2$. The fact that this is possible is a "miracle" that comes out of the Nishimori identities that were used. Setting

$$X = \langle \sigma_i \rangle_{\sim ij} \langle \sigma_j \rangle_{\sim ij}, \qquad Y = \langle \sigma_i \sigma_j \rangle_{\sim ij}$$

and using the change of variables $m = p - 2k + 1$ the last expression becomes ($k \ge l$)



Ej. 



/ \2k-2l zt /97\ 



m=0 



One can check that this is equal to 



2k-2l r:^2l~2 



(2/)! 



dx^i- 



X'*^-\X - Y) 



21 



The latter can be checked by first expanding $(X - Y)^{2l}$ and then differentiating. On the other hand one can use the Leibniz rule



d- 



21-2 



dX 



21-2 



X'^-^X - Y) 



21 



21-2 
r=0 



9/ _ 9\ f)r ff.l-2-r 

^^ 2\ C x^k-2 ^^_ ^ .^. {X-Y) 



21 



r / dX"-' 



QX^k- 



to find that the last expectation above is equal to 

21-2 



(X - Yf{a,)%^' Yl ^rikX^^X - Y) 



2l-r-2 



r=0 
where



^ {'';')[2lU2k-2U_,. 



(20! 



$$[m]_r = m(m-1) \cdots (m-r+1)$$



We define Aqu = |. We proceed similarly for the terms with k < I. Finaly 
one finds 



k>l>l 
21-2 

X 

r=0 



{(Ti(Tj)^ij - {(ri)^ij{ai)^^j (a,) 



y2k-2l 



r=0 ^ 

+ y^ idem with k, 1 and i, j exchanged 



2/-2-r' 



(54) 



l.>k>l 



Let us now briefly justify that the series is absolutely convergent for 
channels satisfying H. We Note the following facts: Arik < ( ~ )2^^~^ 
and 2^'^~^3^'~^ < (|)2A:+2«-4 for fc > / together with the version with kj 
exchanged. It easily follows that 



' ' -625 - 



{(yi(yj)^ij - {c^i)^ij{(^i)r 



E(i) 

fc,/>i 



2k+2l. (2k) (21) 

\m\ m\ 



(55) 



Thus the series for $S_2$ is absolutely convergent as long as



'2' 



E/J\2p| (2p)| ^ 

p=i 



-oo 



Note that we have not attempted to optimize the above estimates. 

C Appendix C 

We prove lemma 2 for highly noisy general BMS channels. For this we use the high noise expansion derived in appendix B. There it was derived for a general linear code ensemble, and this is also the framework of the proof below. Of course the result applies to the interpolating ensemble of lemma 2. Note that the final constants $F(\epsilon)$ and $G(\epsilon)$ do not depend on the code ensemble but only on the channel.

Consider equation (8) for $\frac{d^2}{d\epsilon^2} \mathbb{E}_{\mathcal{C},t}[h_n]$. By the same estimates as those for $S_1$ in appendix B, the first term on the right hand side is certainly greater than



E 



\m. 



(2p)| 



-2p{2p~r 



-A 



To get a lower bound for the second term we consider the series expansion given by that for $S_2$ in (54). In that series we keep the first term corresponding to $k = l = 1$, namely



i(mf))^5:E,,. 



jyi 



(o-icrj)^y - (o-i)^y(o-j) 



lj\Oj/^lj 



B 



and lower bound the rest of the series $(k, l) \ne (1, 1)$ by using estimates of appendix B. More precisely this part is lower bounded by



(«|(|(|)^'|"'S"|)^4("'?')=^ 






(orio-j)^ij - {cri)^ij{(yj)^ij 



-c 



Putting these three estimates together we get

$$\frac{d^2}{d\epsilon^2} \mathbb{E}_{\mathcal{C},t}[h_n] \ge -A + B - C \tag{56}$$

As long as the noise level is high enough so that (see H)
5:(|)*|mf"|<(y2-l)(|)Vfl 

p=2 

the inequality (56) implies






{cricrj)^ij - {(yi)^ij{crj)^ij 



<F(e) + G(e)^Ec,t[/i„] (57) 
for two noise dependent positive finite constants $F(\epsilon)$, $G(\epsilon)$.

The final step of the proof consists in passing from the extrinsic average $\langle - \rangle_{\sim ij}$ in the correlation to the ordinary one $\langle - \rangle$. This is achieved as follows. From the formulas (16) and (17) we deduce that



(ajai) - {(rj){ai) = {{ajai)^^j - {(yj)^ij{cri)^ij)Ri 



with 



R. 



V 



(1 - {ai)ti - {aj)tj + {aiaj)Utjy 



< 



:i-mi-t'] 



a function that depends on all log-likelihood variables. 
Thus we have 



2d2 



{{(TjCXi) - {aj){ai)f = [{ojai)^ij - {aj)^ij{ai)^ij) R 



16 



Taking now the expectation $\mathbb{E}_{\mathcal{C},t}$ we get



Ec,t 



(^i^i) - (^j)(^*) 



2n 



X Ei-t 



16 



2n 



Since $t_i$, $t_j$ are independent we get
16 






'l-t2)2n _t2p 



16 E 



[{i-t^y 



n-i^YH-t^y 



16 E 



'-p>0 



2p 



16 



p>o 



(2p) 



(58) 



which converges for highly noisy channels satisfying H. The result of the lemma follows by combining (57) and (58). The constants $F(\epsilon)$ and $G(\epsilon)$ of the lemma are equal to the constants $F(\epsilon)$ and $G(\epsilon)$ of (57) divided by the expression on the right hand side of the last inequality.

D Appendix D

We prove the boundedness and positivity of $\frac{d}{d\epsilon} \mathbb{E}_s[h_{n,\gamma}(t_*, s)]$ which is needed in the proof of lemma 2.

Lemma 3. For the BEC and BIAWGNC with any noise level, and any BMS satisfying H, there exists a constant $k(\epsilon)$ independent of $n$, $\gamma$, $t_*$ and $s$ such that

$$0 \le \frac{d}{d\epsilon} \mathbb{E}_s[h_{n,\gamma}(t_*, s)] \le k(\epsilon) \tag{59}$$

For the EEC we can take k{t) = ^-^ and for the EIAWGNC k{e) = ^. 
For general BMS channels satisfying H the constant remains bounded as a function of $\epsilon$ (i.e. in the high noise regime).

Here we have stated the lemma for the multi-Poisson interpolating en- 
semble which is our specific need. However as the proof below shows it is 
independent of the specific code ensemble and the bound depends only on 
the channel. 

Proof. We will use the GEXIT formula of lemma [H Since the proposition 
applies for any linear code it also applies for the interpolating ensemble of 
interest here. In the case of the BEC and BIAWGNC we have (see ([5]), ([7]) 

-Es[hn,.,{t,,s)] = -^{1-Es[{cxi)s] 

and 

-Es[hn,^{t,,s)] = -{1 -Es[{ai)s] 

The bounds of the lemma follow immediately since $-1 \le \sigma_i \le 1$.

For highly noisy BMS channels we proceed by expansions. For this reason we have to use the "extrinsic form" of the GEXIT formula (analogous to (21))

Af r. .. „M _ r' .. 9cnit,] 
de 



-Es[hn,^i[t^,s)\= / dti — Es^^t^ Inl — — J 



Expanding the logarithm and using Nishimori identities (as in the expansion of $S_1$ in appendix B) we obtain



d „ r, , X, v—^ "nv 



-E.[/..„(t., .)] = 5: ^^^^^E,.,[(^i)?^i - 1] 
The positivity follows from $m_1^{(2p)} \le 0$ and $-1 \le \sigma_i \le 1$. The upper bound (and absolute convergence) follow from condition H. In particular we get



, (2P) I 



^ ^ ^ 2p{2p - 1) 
which is independent of $n$, $\gamma$, $t_*$, and $s$. □